cc-by (c) Jan Aerts, 2019-2023

This blog post is the hub for the vega-lite and vega tutorial taught at the EBI workshop "Data Visualisation for Biology: A Practical Workshop on Design, Techniques and Tools". It also serves as course material for the Data Visualisation for Data Science course at UHasselt and KU Leuven.

In this tutorial, we will work in phases, looking at vega-lite, vega, and observable in turn. We’ll start with vega-lite and vega in the online editor that they provide, and move on to observable at a later stage.

This tutorial is based on material provided on the vega-lite, vega and observablehq websites, as well as teaching material collected by Ryo Sakai.

Preamble

What are vega-lite and vega?

You might have heard of D3 as a library to create interactive data visualisations. Indeed, this library is considered the standard in the field. Unfortunately, it has quite a steep learning curve, which makes it far from ideal if you have to learn it in 2-3 days without a background in JavaScript programming. In this course, we’ll use vega and vega-lite instead. Both are so-called declarative languages, where you tell the computer what you need, not how to do it. Vega and vega-lite specifications are written in JSON (“JavaScript Object Notation”).

This image by Eric Marty provides a good overview of how the different parts are related:

To give you an idea, here’s a small vega-lite specification that shows a bar chart.

{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "data": {
    "values": [
      {"a": "A", "b": 28}, {"a": "B", "b": 55}, {"a": "C", "b": 43}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "b", "type": "quantitative"},
    "y": {"field": "a", "type": "ordinal"}
  }
}

The resulting bar chart:

JSON

Let’s first have a look at the JSON format:

  • strings are put between double quotes "
  • numbers are not put between quotes
  • lists (aka arrays) are put between square brackets []
  • objects (aka hashes or dictionaries, i.e. collections of key-value pairs) are put between curly brackets {}; each key is a string in double quotes and is separated from its value by a colon :

These objects can also be nested. In the example above, the whole thing is a single JSON object consisting of key-value pairs (the keys being "$schema", "data", "mark" and "encoding"). The "data" key itself holds an object { "values": [ ... ]}. The "values" key in turn holds an array of objects.

Different elements in a JSON object or array have to be separated by a comma ,.
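
As a small illustration of the rules above, here is a made-up JSON object. The keys "name", "year", "tools" and "author" are invented for this example and are not part of vega or vega-lite.

```json
{
  "name": "vega-lite tutorial",
  "year": 2019,
  "tools": ["vega-lite", "vega", "observable"],
  "author": {
    "given": "Jan",
    "family": "Aerts"
  }
}
```

Here "name" holds a string, "year" a number, "tools" an array of strings, and "author" a nested object. Elements are separated by commas, but there is no comma after the last element of an object or array: standard JSON does not allow one there.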

The actual tutorials