Schedule

Social Data Science with R

  • Daniel Anderson
  • Brendan Cullen
  • Ouafaa Hmaddi

In addition to the course books linked below, I am currently working on course notes for all of the classes. This may eventually turn into a published book, but for now I’m just thinking of it as course notes. Please feel free to use these as a supplemental resource (in addition to the slides and lectures). Also, please let me know if you’d like to contribute! In particular, if you see typos or areas that are unclear, that feedback would be really helpful. You can fork the repo and submit a PR and be a contributor to the book!

Course Books

Each of the links below goes to the full book. Icons in the schedule link to specific chapters.

Week 1

Topics
Slides
Assigned
Due
Reading
Lecture

01-04
Introduction
Weekly schedule and topics. Introductions. A little bit of git and GitHub.


01-06
Upping your git skills
We’ll focus the whole day on git, solidifying the skills you built in the first class while also introducing new topics: branching, stashing, pull requests, and managing projects with multiple team members while avoiding merge conflicts.

Week 2

01-11
Intro to visualizations
We will spend most of the day looking at different types of visualizations, with a specific focus on continuous variables. We will explore how different choices with these visualizations can change your inferences.


01-13
Lab 1
We will apply the skills learned in the previous lecture by trying out different bins and bandwidths for histograms and density plots. We’ll also replicate a plot (with ggplot) showing different amounts by group.

Week 3

01-18
Martin Luther King Jr. Day
No class. Black Lives Matter.


01-20
Joins
Many of the best visualizations require joining data from diverse sources. In this lecture we’ll discuss keys for joining data, different types of mutating and filtering joins, and common errors in joining data.

Week 4

01-25
Visual Perception
Aesthetic mappings and visual encodings of data. The data-ink ratio and the pitfall of rigid rules. Some general rule-of-thumb recommendations.


01-27
Lab 2
We will use ggplot2 to replicate plots produced by fivethirtyeight.

Week 5

02-01
Color
Three primary means by which color can aid interpretation. Color blindness considerations and color palettes that work well. Common pitfalls with the use of color.


02-03
Lab 3
We will use color for each of its primary uses in data visualization, and evaluate how different palettes hold up under different types of color blindness.

Week 6

02-08
Communication
Refining your plots for communication. We’ll discuss annotating plots, aspect ratios, scales, and a bit on theming.


02-10
Uncertainty 1
As it turned out, we didn’t get very far here, so we’ll do it again!

Week 7

02-15
Dashboards: Guest Lecturer Akhila Nekkanti
Building (static) data dashboards with the {flexdashboard} package. We’ll discuss layouts, including multi-page layouts, storyboards, icons, and publishing through GitHub.


02-17
Uncertainty 2
Common methods for visualizing uncertainty (and their implementation with {ggplot2}). Framing uncertainty as relative frequencies. Non-standard methods for visualizing standard errors, bootstrapping, and a brief foray into hypothetical outcomes plots.

Week 8

02-22
Tables and fonts
We will focus primarily on two packages for creating tables: {gt} for static tables, and {reactable} for interactive tables. We’ll also discuss changing fonts, both within websites/applications and with {ggplot2}.


02-24
Websites
This lecture will focus on the {distill} package, which helps you create relatively simple yet customizable blogs, optimized for scientific communication. In addition to setting up the site and creating posts, we’ll focus specifically on different layouts for displaying visual information (e.g., plots, tables).

Week 9

03-01
Intro to Geographic data
Understanding the difference between vector and raster data, producing basic maps, getting data for producing different types of maps, and understanding the basics of the R geospatial ecosystem (which is consistently and rapidly evolving).


03-03
Customizing web pages
We’ll talk about customizations generally, introducing styling with CSS, but focus specifically on customizations to websites and dashboards.

Week 10

03-08
Presentations
Half the class, chosen at random, will present their final projects.


03-10
Presentations
All remaining students will present their final project.

Week 11

03-17
Finals Week
Your final project is due before midnight.