Categorical colours
In the last exercise, we defined the colour of each bar in the data itself. It would be better to let Vega pick the colours for us. This is where colour scales come into play.
Vega can assign colours to datapoints for you, using one of the colour schemes listed at https://vega.github.io/vega/docs/schemes/. Instead of hard-coding the colour for each datapoint, let’s give each datapoint a category and create a scale to assign colours.
{
  "$schema": "https://vega.github.io/schema/vega/v5.json",
  "width": 400,
  "height": 200,
  "padding": 5,
  "data": [
    {
      "name": "table",
      "values": [
        {"x": 15, "y": 8, "category": "A"},
        {"x": 72, "y": 25, "category": "B"},
        {"x": 35, "y": 44, "category": "C"},
        {"x": 44, "y": 29, "category": "A"},
        {"x": 24, "y": 20, "category": "B"}
      ]
    }
  ],
  "scales": [
    {
      "name": "xscale",
      "domain": {"data": "table", "field": "x"},
      "range": "width"
    },
    {
      "name": "yscale",
      "domain": {"data": "table", "field": "y"},
      "range": "height"
    },
    {
      "name": "colourScale",
      "type": "ordinal",
      "domain": {"data": "table", "field": "category"},
      "range": {"scheme": "category10"}
    }
  ],
  "marks": [
    {
      "type": "symbol",
      "from": {"data": "table"},
      "encode": {
        "enter": {
          "x": {"scale": "xscale", "field": "x"},
          "y": {"scale": "yscale", "field": "y"},
          "size": {"value": 200},
          "fill": {"scale": "colourScale", "field": "category"}
        }
      }
    }
  ]
}
What we did here:
- We added a “category” to each datapoint. This does not have to be named “category”, but can have any name.
- We created a new scale, called “colourScale”. The name, type and domain are as we described above, but for the range we set {"scheme": "category10"}. category10 is only one of the possible colour schemes, which are all listed on https://vega.github.io/vega/docs/schemes/.
- For fill we now use the scale as well.
- Just to make the colour a bit more clear, we increased the size of the points.
The resulting plot:
As you can see, the points with the same category get the same colour.
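To make the category-to-colour mapping readable without looking at the data, one option is to add a legend driven by the same scale. The fragment below is a minimal sketch and not part of the original exercise: it assumes the scale is still named colourScale, and the title text is made up for illustration. It would sit at the top level of the specification, next to "scales" and "marks".

```json
"legends": [
  {
    "fill": "colourScale",
    "title": "Category"
  }
]
```

Because the legend refers to the scale by name, it should keep following the scale if you later change the colour scheme.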
Sequential colours
What if we want the colour to depend not on a nominal value such as category, but on a numerical value? Let’s say, on x.
Change the colourScale to:
{
  "name": "colourScale",
  "type": "linear",
  "domain": {"data": "table", "field": "x"},
  "range": {"scheme": "blues"}
}
and change the field in encode -> fill from category to x.
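For reference, this is roughly how the mark would look after that change, assuming everything else in the specification stays as in the categorical example. Only the field used for fill differs.

```json
{
  "type": "symbol",
  "from": {"data": "table"},
  "encode": {
    "enter": {
      "x": {"scale": "xscale", "field": "x"},
      "y": {"scale": "yscale", "field": "y"},
      "size": {"value": 200},
      "fill": {"scale": "colourScale", "field": "x"}
    }
  }
}
```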
You’ll get this image:
Exercise - We use x both in the definition of colourScale and as the field in the encoding. What would it mean if we used x in the definition of colourScale, but y in the encoding?
Exercise - Try out some of the diverging colour schemes mentioned on https://vega.github.io/vega/docs/schemes/.
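As a possible starting point for this exercise, the sketch below swaps the sequential scheme for a diverging one. It assumes the redblue scheme, which is one of the diverging schemes listed on that page; any other diverging scheme name should work the same way.

```json
{
  "name": "colourScale",
  "type": "linear",
  "domain": {"data": "table", "field": "x"},
  "range": {"scheme": "redblue"}
}
```

A diverging scheme is most meaningful when the data has a natural midpoint. Here the middle of the x range simply lands on the neutral centre of the scheme.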
If you really want to, you can also set the colours by hand. For example, say we want categories A and B to be blue and category C to be red. To do this, we simply provide an array for both domain and range, like so:
{
  "name": "colourScale",
  "type": "ordinal",
  "domain": ["A", "B", "C"],
  "range": ["blue", "blue", "red"]
}
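The colours do not have to be named colours. As a sketch, assuming any CSS colour string is accepted here, the same scale could use hex codes instead. The two hex values below are chosen purely for illustration.

```json
{
  "name": "colourScale",
  "type": "ordinal",
  "domain": ["A", "B", "C"],
  "range": ["#1f77b4", "#1f77b4", "#d62728"]
}
```

The order matters: the first entry in domain gets the first entry in range, the second gets the second, and so on.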