This series of lectures aims to build foundational skills in modern programmatic data visualization. There are three main parts:
- Exploratory dataviz with Python notebooks and chart libraries like Matplotlib, Seaborn and Plotly.
- Moving the results of those explorations onto the web with Plotly, a powerful dataviz tool.
Note that D3 is best thought of as a collection of modules or tools for such things as selecting and manipulating web-page elements (usually scalable vector graphics (SVG)), doing the mathematical book-keeping necessary for dataviz work (e.g. scales) and geographic mapping (a huge range of map forms available). We’ll cover some of the key ones but a short course can only scratch the surface.
Although many of you won’t end up programming dataviz for the web, this course should give you enough knowledge to know how it’s done and to interact with and direct those doing it.
- Proprietary tools like Tableau and Power BI are increasingly sophisticated and would require a course in their own right.
- I think it’s better to understand the ecosystem from which they come and the foundations on which they rest.
- A first-class, general-purpose programming language like Python, allied to Pandas, notebooks and built-in charting, represents a very powerful, expressive context for exploratory dataviz (programmers almost always get frustrated with the limitations of GUI-driven software like Tableau, and with good reason).
- Both programming and dataviz are craft skills, learned in the doing. The best way to learn is to find a project that is meaningful to you and try and scratch that itch.
I’m treating these lectures as a continuous block of time with tasks to be done. Online learning seems to be more intense and tiring than its real-world counterpart so we’ll have regular breaks to recover and take stock.
Submission of projects
Submission of projects as Jupyter notebooks and/or codepen (or other sandbox) ‘pens’ is encouraged but not mandatory.
The work will be assessed on all programmatic aspects of the dataviz process, which include cleaning your data. The dirty truth of dataviz is that most datasets are full of errors, and cleaning them is a vital part of the process. Pandas and Jupyter notebooks are a great way to do this (see Chapter 9 of my book for an example of the process).
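As a rough illustration of the kind of cleaning meant here, the sketch below tidies a tiny, made-up table with Pandas. The column names (`country`, `date`, `deaths`), the values and the `clean` helper are all hypothetical, chosen only to show common problems: stray whitespace, inconsistent capitalisation, numbers stored as text, a missing value and a duplicated row. Real datasets will need their own checks.

```python
import pandas as pd

# Hypothetical raw data with typical problems (not taken from any real source).
raw = pd.DataFrame({
    "country": ["Italy", "italy ", "Spain", "Spain", "France"],
    "date": ["2020-03-01", "2020-03-02", "2020-03-01", "2020-03-01", "2020-03-01"],
    "deaths": ["29", "34", "0", "0", None],
})


def clean(df):
    """Return a tidied copy of df; the input frame is left untouched."""
    out = df.copy()
    # Normalise names: strip whitespace and use consistent capitalisation.
    out["country"] = out["country"].str.strip().str.title()
    # Parse date strings into proper datetime values.
    out["date"] = pd.to_datetime(out["date"])
    # Convert text to numbers; anything unparseable becomes NaN.
    out["deaths"] = pd.to_numeric(out["deaths"], errors="coerce")
    # Drop rows with no count, then drop exact duplicate rows.
    out = out.dropna(subset=["deaths"]).drop_duplicates()
    out["deaths"] = out["deaths"].astype(int)
    return out.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
```

In a notebook it is worth inspecting the frame after each step (for example with `cleaned.info()`), since silently dropping rows is itself a decision that should be visible in the submitted work.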
All of these will be assessed:
- Cleaning datasets.
- Exploring with Jupyter and chart libraries such as Matplotlib and Seaborn.
- Static and dynamic charts.
- Any dataset used in the final visualization (ancillary or supplemental datasets are fine).
If you have any queries just send me an email.
There will probably be some free time in the last lecture and I’d like to give you an opportunity to choose a focus or two. Here are some options:
- Modern JS (node-based, modular imports, much improved syntax)
- D3 in more depth
- Demo of more varied Matplotlib charts in exploratory dataviz
- Discussion of some general points about modern web dataviz
- Some issues thrown up by your projects
Feel free to vote by Zoom message, send me an email, or open a group discussion on Moodle.
- craft a raw dataset into an insightful chart using Pandas and Matplotlib.
- convert this chart to an embedded web chart using Python’s Plotly library.
Attribution: the original chart was produced by John Burn-Murdoch et al. at the Financial Times. Note that John Burn-Murdoch uses D3 to produce all forms of his charts, print and web-based.
First we’ll build the chart using Matplotlib and Pandas’ built-in (Matplotlib-based) plotting methods:
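A minimal sketch of this first step, assuming the cleaned data has already been reshaped into one column per country. The numbers, the country names and the "days since 10th death" index are invented for illustration and are not the data used in the lectures.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; omit this line inside a notebook
import pandas as pd

# Made-up cumulative counts, one column per (hypothetical) country.
df = pd.DataFrame(
    {"Country A": [10, 18, 35, 60, 110], "Country B": [10, 14, 20, 29, 40]},
    index=pd.RangeIndex(5, name="days since 10th death"),
)

# DataFrame.plot draws one line per column and returns a Matplotlib Axes,
# so the usual Matplotlib methods can be used to refine the chart.
ax = df.plot(logy=True, figsize=(8, 5), linestyle="--", marker="o")
ax.set_ylabel("cumulative deaths (log scale)")
ax.set_title("Illustrative data only")
fig = ax.get_figure()
```

Because the Pandas plotting methods hand back ordinary Matplotlib objects, quick exploratory plots can be polished incrementally rather than rewritten.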
We’ll then take that as a basis for an interactive Plotly chart that can be embedded in a standard web-page:
- Burn-Murdoch criticism of Johns Hopkins https://twitter.com/jburnmurdoch/status/1263062077213749250
- Matplotlib line reference: https://matplotlib.org/gallery/lines_bars_and_markers/line_styles_reference.html
- review the Covid-19 Jupyter notebook
- look at Plotly and Mapbox maps
- create a browser-based presentation using HTML, the Covid chart, maps and CSS flex-box
- embed static image from Jupyter -> Dropbox -> Codepen
We’ll aim to produce a rough chart-based web presentation using a couple of Covid-19 Plotly charts, learning basic web-dev practice as we go along:
- review the Plotly chart-based web presentation.
- review flexboxes
- intro to SVG with face challenge
- intro to D3 with D3 challenges
- Plotly charts
- Kaggle Datasets
- SVG Visual Cheatsheet
- Codepen UCL D3 Demo
- UCL Codepen Collection
- Codepen Projects
- D3 enter and exit patterns
- Mike Bostock’s Blocks
- D3 recap - mastering the select functor
- moving to Codepen project and out of Codepen sandbox
- Getting and manipulating data in JS: the powerful functional methods map, filter and reduce
- Recap JS data manipulation
- Loading data into a JS app
- Using that data to make a chart, after filtering
- enhance our Covid-19 web presentation:
- Covid-19 dataviz before
- Covid-19 dataviz after (note: remove specific sizing, and remove or resize margins)
Deploying to surge.sh (see http://ucl-covid-19-dataviz.surge.sh/)
Questions about web dataviz, projects etc.