Introducing D3

D3 – also referred to as d3.js – is a JavaScript library for creating data visualizations. But that kind of undersells it.

The abbreviation D3 references the tool’s full name, Data-Driven Documents. The data is provided by you, and the documents are web-based documents, meaning anything that can be rendered by a web browser, such as HTML and SVG. D3 does the driving, in the sense that it connects the data to the documents.

Data

Data is an extremely broad term, only slightly less vague than the nearly all-encompassing information.

Broadly speaking, data is structured information with potential for meaning.

In the context of programming for visualization, data is stored in a digital file, typically in either text or binary form. Of course, potentially every piece of digital ephemera may be considered data – not just text, but bits and bytes representing images, audio, video, databases, streams, models, archives, and anything else.

Within the scope of D3 and browser-based visualization, however, we will limit ourselves to text-based data. That is, anything that can be represented as numbers and strings of alphabetic characters.

Binding Data

We use D3’s data() method to bind data to DOM elements. But there are two things we need in place first, before we can bind data:

The data
A selection of DOM elements

Any time after you call data(), you can create an anonymous function that accepts d as input.

The magical data() method ensures that d is set to the corresponding value in your original dataset, given the current element at hand.

Drawing with Data

.attr, .classed, .style

Scales

Scales are functions that map from an input domain to an output range.

The values in any dataset are unlikely to correspond exactly to pixel measurements for use in your visualization. Scales provide a convenient way to map those data values to new values useful for visualization purposes.

D3 scales are functions with parameters that you define. Once they are created, you call the scale function, pass it a data value, and it returns a scaled output value.

A scale is a mathematical relationship, with no direct visual output.

Domains and Ranges

A scale’s input domain is the range of possible input data values.

A scale’s output range is the range of possible output values, commonly used as display values in pixel units.

Normalization

Normalization is the process of mapping a numeric value to a new value between 0 and 1, based on the possible minimum and maximum values.
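
As a quick illustration of that arithmetic, here is a plain JavaScript sketch. The normalize helper is made up for this note and is not part of D3; it assumes min and max are different numbers.

```javascript
// Hypothetical helper (not a D3 function): map value into the 0..1 interval
// based on the smallest and largest possible values.
function normalize (value, min, max) {
  return (value - min) / (max - min)
}

console.log(normalize(100, 100, 500)) // 0
console.log(normalize(300, 100, 500)) // 0.5
console.log(normalize(500, 100, 500)) // 1
```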

Creating a Scale

D3’s linear scale function generator is accessed with d3.scaleLinear().

var scale = d3.scaleLinear()

Now scale is a function to which you can pass input values.

scale(2.5) // returns 2.5

Because we haven’t set a domain and a range yet, this function will map input to output on a 1:1 scale.

We set the scale’s input domain to 100 through 500 by passing those values to the domain() method as an array.

scale.domain([100, 500])

Set the output range in similar fashion, with range():

scale.range([10, 350])

These steps can be chained:

var scale = d3.scaleLinear()
  .domain([100, 500])
  .range([10, 350])

Now the scale maps values from the domain onto the range:

scale(100) // return 10
scale(300) // return 180
scale(500) // return 350
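
Conceptually, a linear scale normalizes the input against the domain and then interpolates across the range. The following plain JavaScript sketch reproduces the three values above; makeLinearScale is a made-up helper for illustration, not D3's actual implementation.

```javascript
// Hypothetical stand-in for d3.scaleLinear().domain(...).range(...)
function makeLinearScale (domain, range) {
  const [d0, d1] = domain
  const [r0, r1] = range
  return function (value) {
    const t = (value - d0) / (d1 - d0) // normalize against the domain
    return r0 + t * (r1 - r0) // interpolate across the range
  }
}

const sketchScale = makeLinearScale([100, 500], [10, 350])
console.log(sketchScale(100)) // 10
console.log(sketchScale(300)) // 180
console.log(sketchScale(500)) // 350
```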

d3.max() and d3.min()

var dataset = [
  [5, 20], [480, 90], [250, 50], [100, 33], [330, 95],
  [410, 12], [475, 44], [25, 67], [85, 21], [220, 88]
]

d3.max(dataset, function (d) {
  return d[0]
})
// this function will return the value 480
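
The accessor function is what lets d3.max() look inside each sub-array. A rough plain JavaScript equivalent, using a made-up maxBy helper rather than D3 itself, might look like this:

```javascript
// Same dataset as above, renamed so the sketch stands alone.
const points = [
  [5, 20], [480, 90], [250, 50], [100, 33], [330, 95],
  [410, 12], [475, 44], [25, 67], [85, 21], [220, 88]
]

// Hypothetical helper: largest value produced by the accessor.
function maxBy (values, accessor) {
  let max
  for (const item of values) {
    const candidate = accessor(item)
    if (max === undefined || candidate > max) max = candidate
  }
  return max
}

console.log(maxBy(points, (d) => d[0])) // 480
console.log(maxBy(points, (d) => d[1])) // 95
```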

Setting Up Dynamic Scales

var xScale = d3.scaleLinear()
  .domain([0, d3.max(dataset, function (d) {
    return d[0]
  })])
  .range([0, w])

Output range is set to 0 and w, the SVG’s width.

Use the scale

.attr('x', function (d) {
  return xScale(d[0])
})

d3.scaleLinear() has several other handy methods that deserve a brief mention here:

nice(): This tells the scale to take whatever input domain you gave to domain() and expand both ends to the nearest round value.
rangeRound(): Use rangeRound() in place of range(), and all values output by the scale will be rounded to the nearest whole number.
clamp(): By default, a linear scale can return values outside of the specified range. For example, if given a value outside of its expected input domain, a scale will return a number also outside of the output range. Calling clamp(true) on a scale forces all output values to be within the specified range. This means excessive values will be rounded to the range’s low or high value.

To use any of these special methods, just tack them onto the chain in which you define the original scale function.

const scale = d3.scaleLinear().domain([0.123, 4.67]).range([0, 500]).nice()
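
To make the effect of clamp() and rangeRound() described above more concrete, here is a plain JavaScript sketch. The makeScale helper and its clamp and round options are invented for this illustration; they only mimic the behaviour and are not how D3 is written.

```javascript
// Hypothetical helper: a linear mapping with optional clamping and rounding.
function makeScale (domain, range, { clamp = false, round = false } = {}) {
  const [d0, d1] = domain
  const [r0, r1] = range
  return function (value) {
    let t = (value - d0) / (d1 - d0)
    if (clamp) t = Math.max(0, Math.min(1, t)) // keep t inside 0..1
    const output = r0 + t * (r1 - r0)
    return round ? Math.round(output) : output
  }
}

const unclamped = makeScale([100, 500], [10, 350])
const clamped = makeScale([100, 500], [10, 350], { clamp: true })
const rounded = makeScale([100, 500], [10, 350], { round: true })

console.log(unclamped(600)) // 435, outside the range
console.log(clamped(600)) // 350, pinned to the high end
console.log(clamped(0)) // 10, pinned to the low end
console.log(rounded(123)) // 30
```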

Other Scales

scaleSqrt
scalePow
scaleLog
scaleQuantize
scaleQuantile
scaleOrdinal
scaleTime

Axes

D3’s axes are actually functions whose parameters you define. Unlike scales, when an axis function is called, it doesn’t return a value, but generates the visual elements of the axis, including lines, labels, and ticks.

Setting Up an Axis

There are four different axis function constructors, each one corresponding to a different orientation and placement of labels: d3.axisTop, d3.axisBottom, d3.axisLeft, d3.axisRight.

const xAxis = d3.axisBottom()

At a minimum, each axis also needs to be told on what scale to operate.

const xAxis = d3.axisBottom().scale(xScale)
// or
const xAxis = d3.axisBottom(xScale)

Finally, to actually generate the axis and insert all those little lines and labels into our SVG, we must call the xAxis function. This is similar to the scale functions, which we first configured by setting parameters, and then later called to put them into action.

Here we put this code at the end of the script, so the axis is generated after the other elements in the SVG, and therefore appears ‘on top’:

svg.append('g').call(xAxis)

This is where things get a little funky. You might be wondering why this looks so different from our friendly scale functions. Here’s why: because an axis function actually draws something to the screen (by appending SVG elements to the DOM), we need to specify where in the DOM it should place new elements.

On the svg, we append() a new g element to the end of the SVG. In SVG land, a g element is a group element. Unlike line, rect, and circle, group elements are invisible and have no visual presence themselves.

A g element helps keep our code tidy, and a transform applied to the group affects everything inside it.

So we’ve created a new g, and then finally, the function call() is called on our new g.

D3’s call() function takes the incoming selection, as received from the prior link in the chain, and hands that selection off to any function.
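
The hand-off that call() performs can be pictured with a tiny stand-in. MiniSelection and fakeAxis below are invented for this sketch and are far simpler than D3's real selection and axis objects; the point is only that call() invokes the given function with the selection and then returns the same selection, so chaining can continue.

```javascript
// Minimal stand-in for a selection, only to illustrate call().
class MiniSelection {
  constructor (name) {
    this.name = name
    this.log = []
  }

  // Hand this selection (plus any extra arguments) to fn, then return it.
  call (fn, ...args) {
    fn(this, ...args)
    return this
  }
}

// Hypothetical "axis" function: records what it would have drawn.
function fakeAxis (selection) {
  selection.log.push('axis drawn into ' + selection.name)
}

const group = new MiniSelection('g')
const returned = group.call(fakeAxis)

console.log(returned === group) // true
console.log(group.log[0]) // 'axis drawn into g'
```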

In this case, the selection is our new g group element. Although the g isn’t strictly necessary, we are using it because the axis function is about to generate lots of crazy lines and numbers, and it’s nice to contain all those elements within a single group object.

call() hands off g to the xAxis function, so our axis is generated within g.

Positioning Axes

By default, an axis is positioned using the range values of the specified scale. In our case, xAxis is referencing xScale, which has a range of [20, 460], because we applied 20 pixels of padding on all edges of the SVG. We can move the axis to the bottom of the chart with a transform:

svg.append('g')
  .attr('class', 'axis')
  .attr('transform', 'translate(0, ' + (h - padding) + ')')
  .call(xAxis)

In the end, we would like our g to look like this in the DOM:

<g class="axis" transform="translate(0, 280)">

CSS styles on axis

The axes themselves are made up of path, line, and text elements, so those are the three elements to target in your CSS.

The paths and lines can be styled together, with the same rules, and text gets its own rules for font and font size.

.axis path, .axis line {
  stroke: teal;
  shape-rendering: crispEdges;
}

.axis text {
  font-family: Optima, Futura, sans-serif;
  font-weight: bold;
  font-size: 14px;
  fill: teal;
}

These CSS rules will override D3’s default styles.

Note that when we use CSS rules to style SVG elements, only SVG attribute names – not regular CSS properties – should be used.

Check for Ticks

Some ticks spread disease, but D3’s ticks communicate information.

You can customize all aspects of your axes, starting with the rough number of ticks, using ticks():

const xAxis = d3.axisBottom().scale(xScale).ticks(6) // set rough # of ticks

Specifying tick values manually

const xAxis = d3.axisBottom().scale(xScale).tickValues([0, 100, 250, 700])

Formatting Tick labels

Enter tickFormat(), which enables you to specify how your numbers should be formatted.

const formatAsPercentage = d3.format(".1%")
xAxis.tickFormat(formatAsPercentage)
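
The ".1%" specifier asks for a percentage with one digit after the decimal point. As a rough plain JavaScript picture of what such a formatter returns (formatAsPercentageSketch is a made-up name, and d3.format itself handles many more cases):

```javascript
// Hypothetical stand-in: multiply by 100, keep one decimal, append "%".
function formatAsPercentageSketch (value) {
  return (value * 100).toFixed(1) + '%'
}

console.log(formatAsPercentageSketch(0.5)) // '50.0%'
console.log(formatAsPercentageSketch(0.25)) // '25.0%'
console.log(formatAsPercentageSketch(1)) // '100.0%'
```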

Updates, Transitions, and Motion

Transition: .transition(), .duration()

Motion: .delay(), .ease()

The argument to delay() and duration() can be either a number or a function:

.transition()
.delay((d, i) => 100 * i)
.duration(300)
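
To see what the function form of delay() works out to, here is a plain JavaScript sketch with no D3 involved. It assumes a hypothetical five-item dataset and uses the 100 ms stagger and 300 ms duration from the snippet above to compute when each element would start and finish.

```javascript
// Hypothetical dataset; only the index of each item matters for the timing.
const items = ['a', 'b', 'c', 'd', 'e']
const staggerMs = 100
const durationMs = 300

// Same rule as .delay((d, i) => 100 * i) followed by .duration(300)
const schedule = items.map((d, i) => {
  const start = staggerMs * i
  return { item: d, start: start, end: start + durationMs }
})

console.log(schedule[0]) // { item: 'a', start: 0, end: 300 }
console.log(schedule[4]) // { item: 'e', start: 400, end: 700 }
```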

The default easing is d3.easeCubicInOut, which produces the gradual acceleration and deceleration.
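
An easing function takes normalized time t between 0 and 1 and returns eased progress. The sketch below writes out a cubic in-out curve in plain JavaScript so the slow start and slow finish are visible in the numbers; cubicInOutSketch and linearSketch are names made up for this note.

```javascript
// Cubic in-out: accelerate during the first half, decelerate during the second.
function cubicInOutSketch (t) {
  return t < 0.5
    ? 4 * t * t * t
    : 1 - Math.pow(-2 * t + 2, 3) / 2
}

// Linear easing for comparison: progress equals time.
function linearSketch (t) {
  return t
}

console.log(cubicInOutSketch(0)) // 0
console.log(cubicInOutSketch(0.25)) // 0.0625 (behind linear's 0.25)
console.log(cubicInOutSketch(0.5)) // 0.5
console.log(cubicInOutSketch(0.75)) // 0.9375 (ahead of linear's 0.75)
console.log(cubicInOutSketch(1)) // 1
```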

.transition()
.duration(3000)
.ease(d3.easeLinear)

d3.easeCircleIn
d3.easeElasticOut
d3.easeBounceOut

on() Transition Starts and Ends

There will be times when you want to make something happen at the start or end of a transition. In those cases, you can use on() to execute arbitrary code for each element in the selection.

on() expects two arguments:

Either “start” or “end”
An anonymous function, to be executed either at the start of a transition, or as soon as it has ended.

Notice: Only one transition can be active on any given element at any given time.

Containing Visual Elements With Clipping Paths

SVG has support for clipping paths, which you might know as masks in many drawing tools.

Much like g, clipPath has no visual presence of its own, but it contains visual elements.

<clipPath id="chart-area">
  <rect x="30" y="30" width="410" height="240"></rect>
</clipPath>

Note that the outer clipPath element has been given an ID of chart-area. We can use that ID to reference it later. Within the clipPath is a rect, which will function as the mask.

Three steps to using a clipping path:

Define a clipPath and give it an ID
Put visual elements within the clipPath (usually just a rect, but this could be circles or any other visual elements)
Add a reference to the clipPath from whatever elements you wish to be masked

svg.append('clipPath')
  .attr('id', 'chart-area') // assign an id
  .append('rect') // within the clippath, create new rect
  .attr('x', padding)
  .attr('y', padding)
  .attr('width', w - padding * 2)
  .attr('height', h - padding * 2)

svg.append('g')
  .attr('id', 'circles')
  .attr('clip-path', 'url(#chart-area)')
  .selectAll('circle')
  .data(dataset)
  .enter()
  .append('circle')

Interactivity

D3 allows you to bind event listeners to more than one element at a time.

svg.selectAll('rect')
  .data(dataset)
  .enter()
  .append('rect')
  ....
  .on('click', (d) => {
    // ...
  })

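Within a regular anonymous function (function () { ... }), D3 automatically sets the context of this so that it references the current element upon which we are acting. Arrow functions do not receive their own this, so use a function expression when you need it.

To keep text labels from intercepting mouse events, disable pointer events on them, either in CSS or with style():

svg text {
  pointer-events: none;
}

svg.append('text').style('pointer-events', 'none')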

Grouping SVG Elements

Note that g group elements do not, by themselves, trigger any mouse events. The reason for this is that g elements have no pixels.

You can still bind event listeners to g elements; just keep in mind that the elements within that g will then behave as a group. If any of the enclosed elements are clicked or moused over, then the listener function will be activated.

Tooltips

In interactive visualizations, tooltips are small overlays that present data values. In many cases, it’s not necessary to label every individual data value in the default view, but that level of detail should still be accessible to users.

SVG Element Tooltips

.on('mouseover', function (d) {
  // read the hovered bar's position; attr() returns a string, so parse it
  const xPosition = parseFloat(d3.select(this).attr('x')) + xScale.bandwidth()
  const yPosition = parseFloat(d3.select(this).attr('y')) + 14

  // create the tooltip label (inside the handler, where the positions are in scope)
  svg.append('text')
    .attr('id', 'tooltip')
    .attr('x', xPosition)
    .attr('y', yPosition)
    .attr('text-anchor', 'middle')
    .attr('font-family', 'sans-serif')
    .attr('font-size', '11px')
    .attr('font-weight', 'bold')
    .attr('fill', 'black')
    .text(d)
})
.on('mouseout', function () {
  // remove the tooltip label when the pointer leaves the bar
  d3.select('#tooltip').remove()
})

Using Paths

Line Charts

Scale setup

const xScale = d3.scaleTime()
  .domain([
    d3.min(dataset, (d) => d.date),
    d3.max(dataset, (d) => d.date)
  ])
  .range([0, w])

const yScale = d3.scaleLinear()
  .domain([
    0,
    d3.max(dataset, (d) => d.average)
  ])
  .range([h, 0])

Define a line generator

const lineGen = d3.line()
  .x(d => xScale(d.date))
  .y(d => yScale(d.average))

The x and y accessors tell the line generator how to decide where to place each point on the line.

const svg = d3.select('body')
  .append('svg')
  .attr('width', w)
  .attr('height', h)

svg.append('path')
  .datum(dataset)
  .attr('class', 'line')
  .attr('d', lineGen)

Instead of using data() to bind each value in our dataset array to a different element, we use datum(), the method for binding a single data value to a single element. The entire dataset array is bound to the new path we just created.

We set a d attribute, passing in our line generator function as an argument.

Since the data has already been bound to the path, the line generator simply grabs that data, plots the points as we specified, and draws a line connecting them.

Dealing with Missing Data

Assume that a value of -99.99 is not a true measurement but a placeholder marking a missing value.

defined() is just another configuration method, like x() and y(). If the result of its anonymous function is true, then that data value is included; otherwise it is excluded.

const lineGen = d3.line()
  .defined(d => (d.average > 0))
  .x(d => xScale(d.date))
  .y(d => yScale(d.average))
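
One way to picture the effect of defined() is that the line gets broken into separate segments wherever a value is rejected. The plain JavaScript sketch below shows that idea with a made-up splitDefined helper and invented sample readings; it is an illustration, not D3's code.

```javascript
// Hypothetical helper: group consecutive accepted values into segments.
function splitDefined (data, isDefined) {
  const segments = []
  let current = []
  for (const d of data) {
    if (isDefined(d)) {
      current.push(d)
    } else if (current.length > 0) {
      segments.push(current) // a rejected value closes the current segment
      current = []
    }
  }
  if (current.length > 0) segments.push(current)
  return segments
}

// Invented sample readings; -99.99 marks a missing measurement.
const readings = [
  { average: 1.2 }, { average: 1.5 }, { average: -99.99 }, { average: 1.1 }
]

const segments = splitDefined(readings, (d) => d.average > 0)
console.log(segments.length) // 2
console.log(segments.map((s) => s.length)) // [ 2, 1 ]
```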

Area Charts

Areas are not too different from lines. If a line is a series of connected (x, y) points, then an area is just that same line, plus a second such line (usually a flat baseline), with all the space in between filled in.

const area = d3.area()
  .defined(d => (d.average > 0))
  .x(d => (xScale(d.date)))
  .y0(() => (yScale.range()[0]))
  .y1((d) => (yScale(d.average)))

y0 represents the area’s baseline, while y1 represents the top.

Selections

A Closer Look at Selections

d3.select('body') returns a Selection, which consists of _groups and _parents

So a selection contains two arrays: _groups and _parents

_groups contains yet another array, which itself contains a list of elements – only one in this case, body.

Let’s expand body, and we will see a lot of properties associated with it.

Selections are just very special objects generated and interpreted by D3. You will never manipulate a selection yourself – don’t bother trying to reach into _groups – as that’s what all of D3’s selection methods are for.

d3.select() and d3.selectAll() operate at the page level, which means the following selects every g element in the document:

const allGroups = d3.selectAll('g')

The select() and selectAll() statements create new selections, and hand those new selections off to the subsequent methods.

To minimize the confusion between old and new selections, Mike Bostock recommends an indentation convention of four spaces when the selection is unchanged, but only two when a new selection is returned.

select() and append() are methods that return new selections; attr() does not, but merely relays whatever selection it just acted on.

Storing Selections

Selections are immutable. All selection methods that affect which elements are selected (or their order) return a new selection rather than modifying the current selection.

That is, once you make a selection, you can’t modify it. You can only make a new one, which could be a subset of the original, and overwrite it.

Enter, Merge and Exit

Filter

d3.select('body').selectAll('p')
  .data(dataset)
  .enter()
  .append('p')
  .text(d => ('I can count up to ' + d))
  .filter(d => (d > 15))
  .style('color', 'red')

filter() takes a selection and an anonymous function. If the function returns true for a given element, then that element is included in the new selection returned by filter().

Each

The most common purpose of creating a selection is ultimately to modify it in some way, such as by using attr() or style(). But it can be useful to define your own functions, especially for custom calculations or modifications that will be repeated.

We can use each() to run an arbitrary function once for each element in a selection. each() takes whatever selection it’s given and calls the specified function once for each item in the selection.

selection.each(function (d, i) {
  // a regular function, so that D3 can set this to the current element
  // ...
})

Some serious points to note:

Since the d and i arguments are specified in the function definition, D3 will populate them for you.
The value of this will also be set by D3 to reflect the element upon which we're currently acting, so d3.select(this) will create a selection with whatever that element is.
Within the function passed to each(), nested callbacks such as the one given to delay() can reference d and i directly from the enclosing scope, with no need to declare (d, i) again.

Layouts

Contrary to what the name implies, D3 layouts do not, in fact, lay anything out for you on the screen. The layout methods have no direct visual output. Rather, D3 layouts take data that you provided and remap or otherwise transform it, thereby generating new data that is more convenient for a specific visual task.

The list of D3 layouts includes:

Chord
Cluster
Force
Pack
Partition
Pie
Stack
Tree
Treemap

Pie Layout

d3.pie() might not be as delicious as it sounds, but it’s still worthy of your attention. It is typically used to create a doughnut or pie chart.

Start with a simple dataset:

const dataset = [ 5, 10, 20, 45, 6, 25 ]

We define a default pie layout:

const pie = d3.pie()

Then, all that remains is to hand off our data to the new pie() function, as in pie(dataset).

The pie layout takes our simple array of numbers and generates an array of objects, one object for each value. Each of those objects now has a few new values – most important, startAngle and endAngle.
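
The angle arithmetic behind those objects is simple enough to sketch in plain JavaScript. pieSketch below is a made-up helper that only computes value, startAngle and endAngle (in radians), handing out angles from the largest value to the smallest; the real d3.pie() returns more fields and options than this.

```javascript
// Hypothetical helper: give each value a share of the full circle (2π radians).
function pieSketch (values) {
  const total = values.reduce((sum, v) => sum + v, 0)
  // Visit indices from largest value to smallest, but keep the input order in the result.
  const order = values.map((v, i) => i).sort((a, b) => values[b] - values[a])
  const result = new Array(values.length)
  let angle = 0
  for (const i of order) {
    const span = (values[i] / total) * 2 * Math.PI
    result[i] = { value: values[i], startAngle: angle, endAngle: angle + span }
    angle += span
  }
  return result
}

const wedges = pieSketch([5, 10, 20, 45, 6, 25])
console.log(wedges[3].startAngle) // 0, because 45 is the largest value
console.log(wedges[5].startAngle === wedges[3].endAngle) // true, 25 comes next
```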

To actually draw the wedges, we turn to d3.arc(), a handy built-in function for drawing arcs as SVG path elements.

Arcs are defined as custom functions, and they require inner and outer radius values.

const w = 300
const h = 300

const outerRadius = w / 2
const innerRadius = 0

const arc = d3.arc()
  .innerRadius(innerRadius)
  .outerRadius(outerRadius)

const svg = d3.select('body')
  .append('svg')
  .attr('width', w)
  .attr('height', h)

Then we can create new groups for each incoming wedge, binding the pie-ified data to the new elements, and translating each group into the center of the chart, so the paths will appear in the right place.

// set up groups
const arcs = svg.selectAll('g.arc')
  .data(pie(dataset))
  .enter()
  .append('g')
  .attr('class', 'arc')
  .attr('transform', 'translate(' + outerRadius + ', ' + outerRadius + ')')

Note that we’re saving a reference to each newly created g in a variable called arcs.

Finally, within each new g, we append a path. A path’s description is defined in the d attribute. So we call the arc generator, which generates the path information based on the data already bound to this group.

// draw arc paths
arcs.append('path')
  .attr('fill', (d, i) => {
    return colors(i)
  })
  .attr('d', arc)

arc is our path generator function.

When a named function is specified as a parameter in this way, D3 automatically passes in the datum and index values, without us having to write them out explicitly.

.attr('d', arc)

// equivalent

.attr('d', (d, i) => (arc(d, i)))

D3 has a number of handy ways to generate categorical colors:

const colors = d3.scaleOrdinal(d3.schemeCategory10)

Generate text labels for each wedge:

arcs.append('text')
  .attr('transform', (d) => {
    return 'translate(' + arc.centroid(d) + ')'
  })
  .attr('text-anchor', 'middle')
  .text((d) => { return d.value })

A centroid is the calculated center point of any shape, whether that shape is regular (like a square) or highly irregular (like an outline of the state of Maryland).

arc.centroid() is a super-helpful function that calculates and returns the center point of any arc.
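
For an arc, that center point is the midpoint between the two radii at the midpoint between the two angles. A plain JavaScript sketch of the calculation follows; centroidSketch is a made-up helper, and it assumes D3's convention that angle 0 points to 12 o'clock and angles grow clockwise, which is why a quarter turn is subtracted.

```javascript
// Hypothetical helper: midpoint radius at the midpoint angle, as [x, y].
function centroidSketch ({ innerRadius, outerRadius, startAngle, endAngle }) {
  const r = (innerRadius + outerRadius) / 2
  const a = (startAngle + endAngle) / 2 - Math.PI / 2 // rotate so 0 is "up"
  return [Math.cos(a) * r, Math.sin(a) * r]
}

// A wedge covering the right half of a pie with radius 150
const rightHalf = centroidSketch({
  innerRadius: 0,
  outerRadius: 150,
  startAngle: 0,
  endAngle: Math.PI
})
console.log(rightHalf) // [ 75, 0 ], halfway out and directly to the right
```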

Note: By default, the pie layout arranges the wedges from the largest value to the smallest.

Stack Layout

d3.stack() converts two-dimensional data into ‘stacked’ data.

It calculates a baseline value for each datum, so you can stack layers of data on top of one another.

Start with some data

const dataset = [
  { apples: 5, oranges: 10, grapes: 22 },
  { apples: 4, oranges: 12, grapes: 28 },
  { apples: 2, oranges: 19, grapes: 32 },
  { apples: 7, oranges: 23, grapes: 35 },
  { apples: 23, oranges: 17, grapes: 43 }
]

The dataset is organized by column. The stack layout reconfigures the data to be organized by categories.

// Set up stack method
const stack = d3.stack().keys(['apples', 'oranges', 'grapes'])

// Data, stacked
const series = stack(dataset)

Now series is an array with three values, each one itself an array corresponding to each categorical series: apples, oranges, and grapes (in that order, because we specified it as such in keys()).

// 'series', the array formerly known as 'dataset'
[
  [ [ 0,  5], [ 0,  4], [ 0,  2], [ 0,  7], [ 0, 23] ],  // apples
  [ [ 5, 15], [ 4, 16], [ 2, 21], [ 7, 30], [23, 40] ],  // oranges
  [ [15, 37], [16, 44], [21, 53], [30, 65], [40, 83] ]   // grapes
]
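
The baseline and top values are just running totals per row. The following plain JavaScript sketch (stackSketch is a made-up helper, not D3's implementation, and it ignores ordering and offset options) reproduces the numbers listed above from the fruit data.

```javascript
// Same fruit data as above, renamed so the sketch stands alone.
const fruit = [
  { apples: 5, oranges: 10, grapes: 22 },
  { apples: 4, oranges: 12, grapes: 28 },
  { apples: 2, oranges: 19, grapes: 32 },
  { apples: 7, oranges: 23, grapes: 35 },
  { apples: 23, oranges: 17, grapes: 43 }
]

// Hypothetical helper: one array of [baseline, top] pairs per key.
function stackSketch (rows, keys) {
  const baselines = rows.map(() => 0) // running total for each row
  return keys.map((key) => rows.map((row, i) => {
    const y0 = baselines[i]
    const y1 = y0 + row[key]
    baselines[i] = y1 // the next key stacks on top of this one
    return [y0, y1]
  }))
}

const stacked = stackSketch(fruit, ['apples', 'oranges', 'grapes'])
console.log(stacked[1][0]) // [ 5, 15 ], oranges in the first row
console.log(stacked[2][4]) // [ 40, 83 ], grapes in the last row
```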

To stack elements visually, we can now reference each data object’s baseline and top line values.

// Add a group for each row of data
const groups = svg.selectAll('g')
  .data(series)
  .enter()
  .append('g')
  .style('fill', (d, i) => {
    return colors(i)
  })

// Add a rect for each data value
const rects = groups.selectAll('rect')
  .data((d) => (d))
  .enter()
  .append('rect')
  .attr('x', (d, i) => {
    return xScale(i)
  })
  .attr('y', (d) => {
    return yScale(d[0])
  })
  .attr('height', (d) => {
    // difference between the scaled top and the scaled baseline
    return yScale(d[1]) - yScale(d[0])
  })
  .attr('width', xScale.bandwidth())

Within each group, we select all the rects and bind a subset of the data with this line

.data((d) => (d))
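
In other words, each group already holds one inner array of series as its datum, and (d) => (d) passes that array on so that every [baseline, top] pair gets its own rect. A plain JavaScript sketch of the same nesting, using only the first two columns of the apples and oranges series shown earlier:

```javascript
// First two columns of the apples and oranges series from the stacked output.
const seriesSample = [
  [[0, 5], [0, 4]], // apples
  [[5, 15], [4, 16]] // oranges
]

// Outer map: one group per series. Inner map: one rect per [baseline, top] pair.
const rectsPerGroup = seriesSample.map((layer) =>
  layer.map((pair) => ({ baseline: pair[0], top: pair[1] }))
)

console.log(rectsPerGroup.length) // 2 groups
console.log(rectsPerGroup[0].length) // 2 rects in the first group
console.log(rectsPerGroup[1][0]) // { baseline: 5, top: 15 }
```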

A New Order

Unless otherwise specified, series will be stacked in the order specified in keys().

But you can use order() to specify an alternate sequence to be applied before all the stacked values are calculated.

const stack = d3.stack()
  .keys([ 'apples', 'oranges', 'grapes' ])
  .order(d3.stackOrderDescending)

d3.stackOrderNone
d3.stackOrderReverse
d3.stackOrderAscending
d3.stackOrderDescending

Stacked Areas

This can be a valuable way to represent time series data, when a sum total is derived from several related categories.

Force Layout

Force-directed layouts are so-called because they use simulations of physical force to arrange elements on the screen.

Force layouts are typically used with network data. In computer science, this kind of dataset is called a graph. A simple graph is a list of nodes and edges. The nodes are entities in the dataset, and the edges are the connections between nodes. Some nodes will be connected by edges and others won’t.

Nodes are commonly represented as circles and edges as lines.

The physical metaphor here is of particles that repel each other, yet are also connected by strings. The repelling forces push particles away from each other, preventing visual overlap, and the strings prevent them from just flying out into space, thereby keeping them on the screen where we can see them.

Preparing the network data

var dataset = {
  nodes: [
    { name: "Adam" },
    { name: "Bob" },
    { name: "Carrie" },
    { name: "Donovan" },
    { name: "Edward" },
    { name: "Felicity" },
    { name: "George" },
    { name: "Hannah" },
    { name: "Iris" },
    { name: "Jerry" }
  ],
  edges: [
    { source: 0, target: 1 },
    { source: 0, target: 2 },
    { source: 0, target: 3 },
    { source: 0, target: 4 },
    { source: 1, target: 5 },
    { source: 2, target: 5 },
    { source: 2, target: 5 },
    { source: 3, target: 4 },
    { source: 5, target: 8 },
    { source: 5, target: 9 },
    { source: 6, target: 7 },
    { source: 7, target: 8 },
    { source: 8, target: 9 }
  ]
};

Defining the Force Simulation

// Initialize a simple force layout, using the nodes and edges in dataset
const force = d3.forceSimulation(dataset.nodes)
  .force('charge', d3.forceManyBody())
  .force('link', d3.forceLink(dataset.edges))
  .force('center', d3.forceCenter().x(w/2).y(h/2))

Call d3.forceSimulation() and pass in a reference to the nodes. This will generate a new simulator and automatically start running it, but without any forces applied, it won’t be very interesting.

To create forces, call force() as many times as you like, each time specifying an arbitrary name for the force (in case you want to reference it later) and a force function.

d3.forceManyBody(): Creates a many-body force which acts on all nodes, meaning this can be used to either attract all nodes to each other or repel all nodes from each other. Try different strength() values, and see what happens. The default strength() is -30, so we will see a slight repelling force.
d3.forceLink(): Our nodes are connected by edges, so we apply this force, specifying dataset.edges. Specify a target distance() (the default is 30 pixels), and this force will struggle against any competing forces to achieve that distance.
d3.forceCenter(): This force centers the entire simulation around whatever point you specify with x() and y().

Creating the Visual Elements

After defining our force simulation, we proceed to generate visual elements.

First we create a line for each edge:

// Create edges as lines
const edges = svg.selectAll('line')
  .data(dataset.edges)
  .enter()
  .append('line')
  .style('stroke', '#ccc')
  .style('stroke-width', 1)

Then create a circle for each node:

const nodes = svg.selectAll('circle')
  .data(dataset.nodes)
  .enter()
  .append('circle')
  .attr('r', 10)
  .style('fill', (d, i) => {
    return colors(i)
  })

Add simple tooltip:

// Add a simple tooltip

nodes.append('title')
  .text((d) => {
    return d.name
  })

Updating Visuals Over Time

D3’s force simulation ‘ticks’ forward through time, just like every other physics simulation. With each tick, the simulation adjusts the position values for each node and edge according to the rules we specified when the layout was first initialized. To see this progress visually, we need to update the associated elements – the lines and circles – on every tick.

// Every time the simulation 'ticks', this will be called
force.on('tick', () => {
  edges.attr('x1', d => (d.source.x))
    .attr('y1', d => (d.source.y))
    .attr('x2', d => (d.target.x))
    .attr('y2', d => (d.target.y))

  nodes.attr('cx', d => (d.x))
    .attr('cy', d => (d.y))
})

This tells D3, ‘OK, every time you tick, take the new (x, y) values for each line and circle and update them in the DOM.’

Here the (x, y) values are calculated and appended to the data by D3.
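
The tick handler can read d.source.x because the link force swaps each edge's numeric source and target for the node objects they point to. The plain JavaScript sketch below mimics that lookup with a made-up resolveEdges helper and a trimmed sample; note that D3 rewrites the original edge objects in place, whereas this sketch returns copies, and the coordinates are invented.

```javascript
// Trimmed sample in the same shape as dataset.nodes and dataset.edges.
const sampleNodes = [{ name: 'Adam' }, { name: 'Bob' }, { name: 'Carrie' }]
const sampleEdges = [{ source: 0, target: 1 }, { source: 0, target: 2 }]

// Hypothetical helper: replace index references with the node objects themselves.
function resolveEdges (nodes, edges) {
  return edges.map((edge) => ({
    source: nodes[edge.source],
    target: nodes[edge.target]
  }))
}

const resolved = resolveEdges(sampleNodes, sampleEdges)

// Pretend the simulation has positioned the first node (made-up coordinates).
sampleNodes[0].x = 120
sampleNodes[0].y = 80

console.log(resolved[0].source.name) // 'Adam'
console.log(resolved[1].source.x) // 120, both edges share the same node object
```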

Draggable Nodes

Add the call() statement to our nodes:

.call(d3.drag()
  .on('start', dragStarted)
  .on('drag', dragging)
  .on('end', dragEnded))

This bit of code ‘calls’ the d3.drag() method on each node.

d3.drag(), in turn, sets event listeners for the three drag-related events, and specifies functions to trigger whenever one of those events occurs.

function dragStarted (d) {
  if (!d3.event.active) force.alphaTarget(0.3).restart()
  d.fx = d.x
  d.fy = d.y
}

function dragging (d) {
  d.fx = d3.event.x
  d.fy = d3.event.y
}

function dragEnded (d) {
  if (!d3.event.active) force.alphaTarget(0)
  d.fx = null
  d.fy = null
}