Introducing D3
D3 – also referred to as d3.js – is a JavaScript library for creating data visualizations. But that kind of undersells it.
The abbreviation D3 references the tool’s full name, Data-Driven Documents. The data is provided by you, and the documents are web-based documents, meaning anything that can be rendered by a web browser, such as HTML and SVG. D3 does the driving, in the sense that it connects the data to the documents.
Data
Data is an extremely broad term, only slightly less vague than the nearly all-encompassing information.
Broadly speaking, data is structured information with potential for meaning.
In the context of programming for visualization, data is stored in a digital file, typically in either text or binary form. Of course, potentially every piece of digital ephemera may be considered data – not just text, but bits and bytes representing images, audio, video, databases, streams, models, archives, and anything else.
Within the scope of D3 and browser-based visualization, however, we will limit ourselves to text-based data. That is, anything that can be represented as numbers and strings of alpha characters.
Binding Data
We use D3’s data() method to bind data to DOM elements. But there are two things we need in place first, before we can bind data:
The data
A selection of DOM elements
Anytime after you call data(), you can create an anonymous function that accepts d as input.
The magical data() method ensures that d is set to the corresponding value in your original dataset, given the current element at hand.
Drawing with Data
.attr(), .classed(), .style()
Scales
Scales are functions that map from an input domain to an output range.
The values in any dataset are unlikely to correspond exactly to pixel measurements for use in your visualization. Scales provide a convenient way to map those data values to new values useful for visualization purposes.
D3 scales are functions with parameters that you define. Once they are created, you call the scale function, pass it a data value, and it nicely returns a scaled output value.
A scale is a mathematical relationship, with no direct visual output.
Domains and Ranges
A scale’s input domain is the range of possible input data values.
A scale’s output range is the range of possible output values, commonly used as display values in pixel units.
Normalization
Normalization is the process of mapping a numeric value to a new value between 0 and 1, based on the possible minimum and maximum values.
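As a minimal sketch of the idea in plain JavaScript (not D3's own code; the helper name normalize is an assumption made for illustration), normalization comes down to one subtraction and one division:

```javascript
// Hypothetical helper: map `value` from the interval [min, max] onto [0, 1].
function normalize(value, min, max) {
  return (value - min) / (max - min);
}

const lowest = normalize(100, 100, 500);  // value at the minimum
const middle = normalize(300, 100, 500);  // value halfway between
const highest = normalize(500, 100, 500); // value at the maximum
console.log(lowest, middle, highest);
```

A linear scale can be thought of as this step followed by stretching the result onto the output range.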
Creating a Scale
D3’s linear scale function generator is accessed with d3.scaleLinear().
var scale = d3.scaleLinear()
Now scale is a function to which you can pass input values.
scale(2.5) // returns 2.5
Because we haven’t set a domain and a range yet, this function will map input to output on a 1:1 scale.
We set the scale’s input domain to [100, 500] by passing those values to the domain() method as an array.
scale.domain([100, 500])
Set the output range in similar fashion, with range():
scale.range([10, 350])
These steps can be chained:
var scale = d3.scaleLinear().domain([100, 500]).range([10, 350])
Now the scale is in effect:
scale(100) // returns 10
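To make the mapping concrete, here is a hand-rolled stand-in for a linear scale. It is a sketch of the arithmetic, not D3's implementation, and linearScale is a hypothetical name. It uses the same domain and range as above.

```javascript
// Hand-rolled stand-in for a linear scale; `linearScale` is a hypothetical name.
function linearScale(domain, range) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  return function (value) {
    const t = (value - d0) / (d1 - d0); // normalize into [0, 1]
    return r0 + t * (r1 - r0);          // stretch onto the output range
  };
}

const sketchScale = linearScale([100, 500], [10, 350]);
console.log(sketchScale(100), sketchScale(300), sketchScale(500));
```

The low end of the domain lands on the low end of the range, the high end on the high end, and values in between are interpolated proportionally.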
d3.max() and d3.min()
var dataset = [
Setting Up Dynamic Scales
var xScale = d3.scaleLinear()
The output range is set to 0 and w, the SVG’s width.
Use the scale:
.attr('x', function (d) {
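The idea behind a dynamic scale can be sketched without D3: find the largest value in the data, use it as the top of the domain, and map onto the width. The data, the width w = 500, and the names points, maxX and xPosition below are all assumptions made for this illustration.

```javascript
// Hypothetical data: each inner array is an [x, y] pair.
const points = [[5, 20], [480, 90], [250, 50], [100, 33]];
const w = 500; // assumed SVG width in pixels

// Largest x value, found with an accessor, in the spirit of d3.max(dataset, d => d[0]).
const maxX = points.reduce((max, d) => Math.max(max, d[0]), -Infinity);

// Linear mapping from the domain [0, maxX] onto the range [0, w].
const xPosition = (d) => (d[0] / maxX) * w;

console.log(maxX, points.map(xPosition));
```

Because the domain is computed from the data, the point with the largest x value always lands at the right edge, whatever the dataset contains.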
d3.scaleLinear() has several other handy methods that deserve a brief mention here:
nice(): This tells the scale to take whatever input domain you gave to domain() and expand both ends to the nearest round value.
rangeRound(): Use rangeRound() in place of range(), and all values output by the scale will be rounded to the nearest whole number.
clamp(): By default, a linear scale can return values outside of the specified range. For example, if given a value outside of its expected input domain, a scale will return a number also outside of the output range. Calling clamp(true) on a scale forces all output values to be within the specified range. This means excessive values will be rounded to the range’s low or high value.
To use any of these special methods, just tack them onto the chain in which you define the original scale function.
const scale = d3.scaleLinear().domain([0.123, 4.67]).range([0, 500]).nice()
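The effect of clamping can be sketched by hand. This is an illustration of the behavior described above, not D3's code, and clampedLinearScale is a hypothetical helper:

```javascript
// Hypothetical helper: a linear mapping whose output never leaves the range.
function clampedLinearScale(domain, range) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  return function (value) {
    let t = (value - d0) / (d1 - d0);
    t = Math.max(0, Math.min(1, t)); // pin out-of-domain inputs to the nearest end
    return r0 + t * (r1 - r0);
  };
}

const clamped = clampedLinearScale([100, 500], [10, 350]);
console.log(clamped(50), clamped(300), clamped(900));
```

Inputs below the domain come out at the range's low value, and inputs above it come out at the high value.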
Other Scales
scaleSqrt
scalePow
scaleLog
scaleQuantize
scaleQuantile
scaleOrdinal
scaleTime
Axes
D3’s axes are actually functions whose parameters you define. Unlike scales, when an axis function is called, it doesn’t return a value, but generates the visual elements of the axis, including lines, labels, and ticks.
Setting Up an Axis
There are four different axis function constructors, each one corresponding to a different orientation and placement of labels: d3.axisTop, d3.axisBottom, d3.axisLeft, and d3.axisRight.
const xAxis = d3.axisBottom()
At a minimum, each axis also needs to be told on what scale to operate.
const xAxis = d3.axisBottom().scale(xScale)
Finally, to actually generate the axis and insert all those little lines and labels into our SVG, we must call the xAxis function. This is similar to the scale functions, which we first configured by setting parameters, and then later called to put them into action.
Here we put this code at the end of the script, so the axis is generated after the other elements in the SVG, and therefore appears ‘on top’.
svg.append('g').call(xAxis)
This is where things get a little funky. You might be wondering why this looks so different from our friendly scale functions. Here’s why: because an axis function actually draws something to the screen (by appending SVG elements to the DOM), we need to specify where in the DOM it should place new elements.
Here, we append() a new g element to the end of the SVG. In SVG land, a g element is a group element. Unlike line, rect, and circle, group elements are invisible and have no visual presence themselves.
A g element helps in two ways: it keeps our code tidy by containing other elements, and any transformation applied to it affects everything inside it.
So we’ve created a new g, and then finally, the function call() is called on our new g.
D3’s call() function takes the incoming selection, as received from the prior link in the chain, and hands that selection off to any function.
In this case, the selection is our new g group element. Although the g isn’t strictly necessary, we are using it because the axis function is about to generate lots of crazy lines and numbers, and it’s nice to contain all those elements within a single group object.
call() hands off g to the xAxis function, so our axis is generated within g.
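The hand-off that call() performs can be illustrated with a toy object. Everything here (ToySelection, toyAxis) is a hypothetical stand-in, not a D3 class; only the pattern is the point: the selection is passed to the function as its first argument, and the same selection is returned so the chain can continue.

```javascript
// Toy stand-in for a D3 selection; only `call` is modeled here.
class ToySelection {
  constructor(name) {
    this.name = name;
    this.children = [];
  }

  // Pass this selection to `fn`, then return the selection itself for chaining.
  call(fn, ...args) {
    fn(this, ...args);
    return this;
  }
}

// Hypothetical "axis" function: it receives the selection and adds things to it.
function toyAxis(selection) {
  selection.children.push('domain path', 'tick', 'tick');
}

const group = new ToySelection('g');
const returned = group.call(toyAxis);
console.log(returned === group, group.children.length);
```

This is why the axis ends up inside the g: the axis function receives the g selection and appends its elements there.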
Positioning Axes
By default, an axis is positioned using the range values of the specified scale. In our case, xAxis is referencing xScale, which has a range of [20, 460], because we applied 20 pixels of padding on all edges of the SVG. We can move the axis by applying a transform to its g:
svg.append('g')
In the end, we would like our g to look like this in the DOM:
<g class="axis" transform="translate(0, 280)">
CSS Styles on Axes
The axes themselves are made up of path, line, and text elements, so those are the three elements to target in your CSS.
The paths and lines can be styled together, with the same rules, and text gets its own rules for font and font size.
.axis path, .axis line {
These CSS rules will override D3’s default styles.
Note that when we use CSS rules to style SVG elements, only SVG attribute names – not regular CSS properties – should be used (for example, fill and stroke rather than background-color and border).
Check for Ticks
Some ticks spread disease, but D3’s ticks communicate information.
You can customize all aspects of your axes, starting with the rough number of ticks, using ticks():
const xAxis = d3.axisBottom().scale(xScale).ticks(6) // set rough # of ticks
Specifying Tick Values Manually
const xAxis = d3.axisBottom().scale(xScale).tickValues([0, 100, 250, 700])
Formatting Tick Labels
Enter tickFormat(), which enables you to specify how your numbers should be formatted.
const formatAsPercentage = d3.format(".1%")
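The ".1%" specifier multiplies a value by 100, keeps one decimal place, and appends a percent sign. A hand-written stand-in shows the intended result; percentSketch is a hypothetical name, and this is not how D3 implements its formatter.

```javascript
// Hand-written stand-in for a one-decimal percentage formatter.
const percentSketch = (value) => (value * 100).toFixed(1) + '%';

console.log(percentSketch(0.125), percentSketch(0.5), percentSketch(1));
```

A formatter function like this is the kind of argument that tickFormat() expects.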
Updates, Transitions, and Motion
Transition: .transition(), .duration()
Motion: .delay(), .ease()
The argument can be a number or a function.
.transition()
The default easing is d3.easeCubicInOut, which produces gradual acceleration and deceleration.
.transition()
d3.easeCircleIn
d3.easeElasticOut
d3.easeBounceOut
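An easing function takes the normalized time t (from 0 to 1) and returns the eased progress. The sketch below writes the standard cubic in-out curve by hand to show the "gradual acceleration and deceleration" shape. The name cubicInOut is an assumption, and the code is illustrative rather than a copy of D3's source.

```javascript
// Cubic in-out easing written by hand: t runs from 0 (start) to 1 (end).
function cubicInOut(t) {
  t *= 2;
  if (t <= 1) return (t * t * t) / 2; // accelerate through the first half
  t -= 2;
  return (t * t * t + 2) / 2;         // decelerate through the second half
}

const samples = [0, 0.25, 0.5, 0.75, 1].map(cubicInOut);
console.log(samples);
```

Progress is slow near the start and the end and fastest in the middle, which is what makes the motion feel natural.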
on() Transition Starts and Ends
There will be times when you want to make something happen at the start or end of a transition. In those cases, you can use on() to execute arbitrary code for each element in the selection.
on() expects two arguments:
Either “start” or “end”
An anonymous function, to be executed either at the start of a transition, or as soon as it has ended.
Note: Only one transition can be active on any given element at any given time.
Containing Visual Elements With Clipping Paths
SVG has support for clipping paths, which you might know as masks in many drawing tools.
Much like g, clipPath has no visual presence of its own, but it contains visual elements.
<clipPath id="chart-area">
Note that the outer clipPath element has been given an ID of chart-area. We can use that ID to reference it later. Within the clipPath is a rect, which will function as the mask.
Three steps to using a clipping path:
Define a clipPath and give it an ID
Put visual elements within the clipPath (usually just a rect, but this could be circles or any other visual elements)
Add a reference to the clipPath from whatever elements you wish to be masked
svg.append('clipPath')
Interactivity
D3 allows you to bind event listeners to more than one element at a time.
svg.selectAll('rect')
Within the anonymous function, D3 automatically sets the context of this so it references the current element upon which we are acting.
svg text {
or
svg.append('text').style('pointer-events', 'none')
Grouping SVG Elements
Note that g group elements do not, by themselves, trigger any mouse events. The reason for this is that g elements have no pixels.
You can still bind event listeners to g elements; just keep in mind that the elements within that g will then behave as a group. If any of the enclosed elements are clicked or moused over, then the listener function will be activated.
Tooltips
In interactive visualizations, tooltips are small overlays that present data values. In many cases, it’s not necessary to label every individual data value in the default view, but that level of detail should still be accessible to users.
SVG Element Tooltips
.on('mouseover', function (d) {
Using Paths
Line Charts
Scale setup
const xScale = d3.scaleTime()
Define a line generator
const lineGen = d3.line()
The x and y accessors tell the line generator how to decide where to place each point on the line.
const svg = d3.select('body')
Instead of using data() to bind each value in our dataset array to a different element, we use datum(), the method for binding a single data value to a single element. The entire dataset array is bound to the new path we just created.
We set a d attribute, passing in our line generator function as an argument.
Since the data has already been bound to the path, the line generator simply grabs that data, plots the points as we specified, and draws a line connecting them.
Dealing with Missing Data
Assume a value of -99.99 is not a true measurement but a marker for a missing value.
defined() is just a configuration method, like x() and y(). If the result of its anonymous function is true, then that data value is included; otherwise, the value is excluded.
const lineGen = d3.line()
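Excluding values in this way breaks the line into separate segments around the gaps. The sketch below shows that splitting in plain JavaScript. The readings and the names isDefined and splitIntoSegments are assumptions for illustration, and this is not D3's internal code.

```javascript
// Hypothetical readings; -99.99 marks a missing measurement.
const readings = [12.1, 12.4, -99.99, 13.0, 13.2, -99.99, -99.99, 14.1];
const isDefined = (d) => d !== -99.99;

// Collect runs of consecutive defined values; each run would be drawn as its own piece of line.
function splitIntoSegments(data, defined) {
  const segments = [];
  let current = [];
  for (const d of data) {
    if (defined(d)) {
      current.push(d);
    } else if (current.length > 0) {
      segments.push(current);
      current = [];
    }
  }
  if (current.length > 0) segments.push(current);
  return segments;
}

const segments = splitIntoSegments(readings, isDefined);
console.log(segments);
```

The missing values never reach the drawing step, so no line is drawn through a bogus -99.99 point.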
Area Charts
Areas are not too different from lines. If a line is a series of connected (x, y) points, then an area is just that same line, plus a second such line (usually a flat baseline), with all the space in between filled in.
const area = d3.area()
y0 represents the area’s baseline, while y1 represents the top.
Selections
A Closer Look at Selections
d3.select('body') returns a Selection, which consists of _groups and _parents.
So a selection contains two arrays: _groups and _parents.
_groups contains yet another array, which itself contains a list of elements – only one in this case, body.
If we expand body, we will see a lot of properties associated with it.
Selections are just very special objects generated and interpreted by D3. You will never manipulate a selection yourself – don’t bother trying to reach into _groups – as that’s what all of D3’s selection methods are for.
d3.select() and d3.selectAll() operate at the page level, which means they search the entire document:
const allGroups = d3.selectAll('g')
The select() and selectAll() statements create new selections, and hand those new selections off to the subsequent methods.
To minimize the confusion between old and new selections, Mike Bostock recommends an indentation convention of four spaces when the selection is unchanged, but only two when a new selection is returned.
select() and append() are methods that return new selections; attr() does not, but merely relays whatever selection it just acted on.
Storing Selections
Selections are immutable. All selection methods that affect which elements are selected (or their order) return a new selection rather than modifying the current selection.
That is, once you make a selection, you can’t modify it. You can only make a new one, which could be a subset of the original, and overwrite the variable that held the old one.
Enter, Merge and Exit
Filter
d3.select('body').selectAll('p')
filter() takes a selection and an anonymous function. If the function returns true for a given element, then that element is included in the new selection returned by filter().
Each
The most common purpose of creating a selection is ultimately to modify it in some way, such as by using attr() or style(). But it can be useful to define your own functions, especially for custom calculations or modifications that will be repeated.
We can use each() to run an arbitrary function once for each element in a selection. each() takes whatever selection it’s given and calls the specified function once for each item in the selection.
selection.each((d, i) => {
Some important points to note:
Since the d and i arguments are specified in the function definition, D3 will populate them for you.
The value of this will also be set by D3 to reflect the element upon which we're currently acting, so d3.select(this) will create a selection with whatever that element is. This only works with a regular function expression; an arrow function does not get its own this.
In delay() or any other function nested within (d, i) => {}, we can reference the values d and i directly, with no need to write (d, i) again.
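The difference between a regular function and an arrow function matters whenever code relies on this. The toy eachSketch below is a hypothetical stand-in for each(), operating on plain objects rather than DOM nodes. It invokes the callback with this set to the current element, and shows that only the regular function sees it.

```javascript
// Toy stand-in for each(): call `fn` once per element, with `this` set to that element.
function eachSketch(elements, data, fn) {
  elements.forEach((element, i) => fn.call(element, data[i], i));
}

const fakeElements = [{ tag: 'rect' }, { tag: 'circle' }]; // plain objects, not DOM nodes
const values = [10, 20];

// A regular function receives the current element as `this`.
const seenByFunction = [];
eachSketch(fakeElements, values, function (d, i) {
  seenByFunction.push(this.tag + ':' + d + ':' + i);
});

// An arrow function ignores the `this` it is called with and keeps the outer one.
const outerThis = this;
let arrowKeepsOuterThis = true;
eachSketch(fakeElements, values, (d, i) => {
  if (this !== outerThis) arrowKeepsOuterThis = false;
});

console.log(seenByFunction, arrowKeepsOuterThis);
```

So a callback that needs the current element through this should be written with the function keyword, while d and i are available either way.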
Layouts
Contrary to what the name implies, D3 layouts do not, in fact, lay anything out for you on the screen. The layout methods have no direct visual output. Rather, D3 layouts take data that you provided and remap or otherwise transform it, thereby generating new data that is more convenient for a specific visual task.
The list of D3 layouts includes:
Chord
Cluster
Force
Pack
Partition
Pie
Stack
Tree
Treemap
Pie Layout
d3.pie() might not be as delicious as it sounds, but it’s still worthy of your attention. It is typically used to create a doughnut or pie chart.
Start with a simple dataset:
const dataset = [ 5, 10, 20, 45, 6, 25 ]
We define a default pie layout:
const pie = d3.pie()
Then, all that remains is to hand off our data to the new pie() function, as in pie(dataset).
The pie layout takes our simple array of numbers and generates an array of objects, one object for each value. Each of those objects now has a few new values – most important, startAngle and endAngle.
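To see where those angles come from, here is a hand-rolled sketch of the calculation. It assumes the default behavior (a full circle, with wedges laid out from largest to smallest value). The name pieSketch is hypothetical, and the objects it returns carry only a subset of what D3 produces.

```javascript
// Hypothetical helper: turn an array of numbers into wedge descriptions.
function pieSketch(values) {
  const total = values.reduce((sum, v) => sum + v, 0);
  // Visit the values from largest to smallest, but keep each result at its original index.
  const order = values.map((_, i) => i).sort((a, b) => values[b] - values[a]);
  const result = new Array(values.length);
  let angle = 0;
  for (const i of order) {
    const sweep = (values[i] / total) * 2 * Math.PI; // share of the full circle, in radians
    result[i] = { data: values[i], startAngle: angle, endAngle: angle + sweep };
    angle += sweep;
  }
  return result;
}

const slices = pieSketch([5, 10, 20, 45, 6, 25]);
console.log(slices[3]); // the wedge for 45, the largest value
```

Each wedge starts where the previous one ended, and the sweep of each wedge is proportional to its share of the total.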
To actually draw the wedges, we turn to d3.arc(), a handy built-in function for drawing arcs as SVG path elements.
Arcs are defined as custom functions, and they require inner and outer radius values.
const w = 300
const svg = d3.select('body')
Then we can create new groups for each incoming wedge, binding the pie-ified data to the new elements, and translating each group into the center of the chart, so the paths will appear in the right place.
// set up groups
Note that we’re saving a reference to each newly created g in a variable called arcs.
Finally, within each new g, we append a path. A path’s path description is defined in the d attribute. So we call the arc generator, which generates the path information based on the data already bound to this group.
// draw arc paths
arc is our path generator function.
When a named function is specified as a parameter in this way, D3 automatically passes in the datum and index values, without us having to write them out explicitly.
.attr('d', arc)
D3 has a number of handy ways to generate categorical colors:
const color = d3.scaleOrdinal(d3.schemeCategory10)
Generate text labels for each wedge:
arcs.append('text')
A centroid is the calculated center point of any shape, whether that shape is regular (like a square) or highly irregular (like an outline of the state of Maryland).
arc.centroid() is a super-helpful function that calculates and returns the center point of any arc.
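For an arc, the center point is the middle radius at the middle angle. The sketch below works that out by hand, assuming angles in radians measured clockwise from 12 o'clock and SVG coordinates in which y grows downward. The name centroidSketch is hypothetical.

```javascript
// Center point of an arc: the middle radius at the middle angle.
function centroidSketch({ innerRadius, outerRadius, startAngle, endAngle }) {
  const r = (innerRadius + outerRadius) / 2;
  // Subtract a quarter turn so that an angle of 0 points straight up.
  const a = (startAngle + endAngle) / 2 - Math.PI / 2;
  return [Math.cos(a) * r, Math.sin(a) * r];
}

// A half-circle wedge covering the right-hand side of the pie.
const [cx, cy] = centroidSketch({
  innerRadius: 0,
  outerRadius: 100,
  startAngle: 0,
  endAngle: Math.PI
});
console.log(cx, cy);
```

The resulting pair can be used in a translate() transform to place a label in the middle of its wedge.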
Note: By default, the pie layout arranges the wedges from largest to smallest value, regardless of their order in the dataset.
Stack Layout
d3.stack() converts two-dimensional data into ‘stacked’ data.
It calculates a baseline value for each datum, so you can stack layers of data on top of one another.
Start with some data:
const dataset = [
The dataset is organized by column. The stack layout reconfigures the data to be organized by categories.
// Set up stack method
Now series is an array with three values, each one itself an array corresponding to each categorical series: apples, oranges, and grapes (in that order, because we specified them as such in keys()).
//'series', the array formerly known as 'dataset'
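The baseline calculation can be sketched by hand. The rows below are made-up numbers (only the key names come from the text), and stackSketch is a hypothetical helper that returns bare [baseline, top] pairs, a simplification of what D3 returns.

```javascript
// Hypothetical rows; the keys match the series named in the text.
const rows = [
  { apples: 5, oranges: 10, grapes: 22 },
  { apples: 4, oranges: 12, grapes: 28 },
  { apples: 2, oranges: 19, grapes: 32 }
];
const keys = ['apples', 'oranges', 'grapes'];

// For each key, produce one [baseline, top] pair per row, stacking on what came before.
function stackSketch(data, keys) {
  const tops = data.map(() => 0); // running top of the stack for each row
  return keys.map((key) =>
    data.map((row, i) => {
      const y0 = tops[i];
      const y1 = y0 + row[key];
      tops[i] = y1;
      return [y0, y1];
    })
  );
}

const stacked = stackSketch(rows, keys);
console.log(stacked);
```

The first series sits on a baseline of zero, and every later series starts where the one before it stopped.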
To stack elements visually, we can now reference each data object’s baseline and topline values.
// Add a group for each row of data
Within each group, we select all the rects and bind a subset of the data with this line:
.data((d) => d)
A New Order
Unless otherwise specified, series will be stacked in the order specified in keys().
But you can use order() to specify an alternate sequence to be applied before all the stacked values are calculated.
const stack = d3.stack()
d3.stackOrderNone
d3.stackOrderReverse
d3.stackOrderAscending
d3.stackOrderDescending
Stacked Areas
Stacked areas can be a valuable way to represent time series data when a sum total is derived from several related categories.
Force Layout
Force-directed layouts are so called because they use simulations of physical forces to arrange elements on the screen.
Force layouts are typically used with network data. In computer science, this kind of dataset is called a graph. A simple graph is a list of nodes and edges. The nodes are entities in the dataset, and the edges are the connections between nodes. Some nodes will be connected by edges and others won’t.
Nodes are commonly represented as circles and edges as lines.
The physical metaphor here is of particles that repel each other, yet are also connected by strings. The repelling forces push particles away from each other, preventing visual overlap, and the strings prevent them from just flying out into space, thereby keeping them on the screen where we can see them.
Preparing the network data
var dataset = {
Defining the Force Simulation
// Initialize a simple force layout, using the nodes and edges in dataset
Call d3.forceSimulation() and pass in a reference to the nodes. This will generate a new simulator and automatically start running it, but without any forces applied, it won’t be very interesting.
To create forces, call force() as many times as you like, each time specifying an arbitrary name for the force (in case you want to reference it later) and a force function.
d3.forceManyBody(): Creates a many-body force that acts on all nodes, meaning it can be used either to attract all nodes to each other or to repel all nodes from each other. Try different strength() values, and see what happens. The default strength() is -30, so we will see a slight repelling force.
d3.forceLink(): Our nodes are connected by edges, so we apply this force, specifying dataset.edges. Specify a target distance() (the default is 30 pixels), and this force will struggle against any competing forces to achieve that distance.
d3.forceCenter(): This force centers the entire simulation around whatever point you specify with x() and y().
Creating the Visual Elements
After defining our force simulation, we proceed to generate visual elements.
First we create a line for each edge:
// Create edges as lines
Then create a circle for each node:
const nodes = svg.selectAll('circle')
Add a simple tooltip:
// Add a simple tooltip
Updating Visuals Over Time
D3’s force simulation ‘ticks’ forward through time, just like every other physics simulation. With each tick, the simulation adjusts the position values for each node and edge according to the rules we specified when the layout was first initialized. To see this progress visually, we need to update the associated elements – the lines and circles – on every tick.
// Every time the simulation 'ticks', this will be called
This tells D3, ‘OK, every time you tick, take the new (x, y) values for each line and circle and update them in the DOM.’
Here the x and y values are calculated by D3 and added to each node object.
Draggable Nodes
Add the call() statement to our nodes:
.call(d3.drag()
This bit of code ‘calls’ the d3.drag() method on each node.
d3.drag(), in turn, sets event listeners for the three drag-related events, and specifies functions to trigger whenever one of those events occurs.
function dragStarted (d) {
Geomapping
…