Welcome!

An overview of the course

Daniel Anderson

Week 1, Class 1

1 / 50

Agenda

  • Getting on the same page
  • Syllabus
  • A little bit of git/GitHub

2 / 50

whoami

  • Research Assistant Professor: Behavioral Research and Teaching
  • Dad (two daughters: 8 and 6)
  • Pronouns: he/him/his
  • Primary areas of interest: 💗💗R💗💗, computational research, achievement gaps, systemic inequities, and variance between educational institutions

3 / 50

whoisyou?

  • Introduce yourself
  • Why are you here?
  • What pronouns would you like us to use for you for this class?
  • What was one thing you did over winter break that was not related to academic work?
4 / 50

A few class policies

  • Be kind

  • Be understanding and have patience, with others and yourself

  • Help others whenever possible

Truly the most important part of this class. Important not just in terms of decency, but also in your learning, and most importantly, for equity.

5 / 50

A bit more specific

Normally I would have information here about welcoming kids into class.

Because we're virtual, that part is both easier and harder.

If you need to not attend class, or a portion of class, for any reason, that is fine.

Ideally you would let me know ahead of time. But we're in the middle of a pandemic and life is cray. Please try to contact me beforehand. If this isn't possible, please check in with me after.

6 / 50

Last intro thing

  • I'm here for you

  • We won't have specific office hours, but know I'm always willing to meet

  • This course, like all in the sequence, can be difficult. Don't suffer in silence. Don't do this alone.

7 / 50

Syllabus

8 / 50

Course Website(s)

9 / 50

Materials

  • Nearly everything will be distributed through the repo and through the website.

  • Please clone the repo now, if you haven't already. Pull each week for the most recent changes.

  • We'll use Canvas for grading, and that is essentially it.
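
For anyone newer to the command line, here is a minimal sketch of the clone-once, pull-each-week routine. The course repo URL is not reproduced here, so a local bare repository stands in for GitHub; the directory names, file names, and the "instructor" copy are all hypothetical and exist only to make the example self-contained.

```shell
set -eu
demo=$(mktemp -d)

# Stand-in for the course repo on GitHub: a local bare repository (assumption).
git init --quiet --bare "$demo/course.git"

# Hypothetical instructor copy that publishes the week 1 materials.
git clone --quiet "$demo/course.git" "$demo/instructor"
cd "$demo/instructor"
git config user.name "Instructor"
git config user.email "instructor@example.com"
echo "Week 1 slides" > week1.txt
git add week1.txt
git commit --quiet -m "Add week 1 materials"
git push --quiet -u origin HEAD

# Student: clone the repo once.
git clone --quiet "$demo/course.git" "$demo/student"

# Instructor pushes new materials the following week.
echo "Week 2 slides" > week2.txt
git add week2.txt
git commit --quiet -m "Add week 2 materials"
git push --quiet

# Student: pull to pick up the most recent changes.
cd "$demo/student"
git pull --quiet --ff-only
ls   # week1.txt and week2.txt are now both present
```

With the real repo, the only steps you would run yourself are the clone (once) and the pull (each week, from inside the cloned folder).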

10 / 50

R Markdown notes

  • These slides were produced with {xaringan}, an R Markdown variant. I encourage you to try it out and use it for your final project presentation.

  • The website was also produced with R Markdown (sort of)

    • It's a {blogdown} website with some custom CSS and Hugo shortcodes
  • This course is not just about data viz, but also mediums for communication. This includes websites and data dashboards among other possibilities.

11 / 50

My assumptions about you

12 / 50

I assume you

  • Understand the R package ecosystem (how to find, install, load, and learn about them)

  • Can read "flat" (i.e., rectangular) datasets into R

    • I don't care what you use, but you should be using RStudio Projects & the {here} package
13 / 50
  • Can perform basic data wrangling and transformations in R, using the tidyverse

    • Leverage appropriate functions for introductory data science tasks (pipeline)
    • "clean up" the dataset using scripts and reproducible workflows
  • Use version control with R via git and GitHub

  • Use R Markdown to create reproducible dynamic reports

14 / 50

Learning objectives

  • Transform data in a variety of ways to create effective data visualizations

  • Understand and fluently apply different types of data joins

  • Understand best practices in data visualization

  • Customize ggplot2 graphics by reordering factors, creating themes, and applying ggthemes

  • Create an online data visualization portfolio using distill and/or flexdashboards to demonstrate key learning
15 / 50

Examples

Below are some links to final projects from students who took this class last year.

16 / 50

Weekly learning objectives

Provide you a frame for what you should be working to learn for that specific week.

This week's objectives

  • Understand the requirements of the course
  • Understand the requirements of the final project
  • Be ready to go with git and GitHub
17 / 50

Required Textbooks (free)

18 / 50

Other books (also free)

19 / 50

Another resource

See the current draft here. Please read Chapter 8 before next class.

20 / 50

Extra credit opportunity

10 points: Deep dive into a topic not covered by the course

21 / 50

Some options

  • Geographic data (we'll have an intro, but there's a ton here and we won't really do it justice)
  • Network data
  • Text data
  • DAGs
  • Flow data (e.g., alluvial diagrams)
  • Relational data (SQL & friends)
  • Interactive plots
  • Animated plots
22 / 50

Some examples

23 / 50



ggdag via Malcolm Barrett

26 / 50



Patrick Honner via NYT

27 / 50

Labs

See the assignments page of the website.

10 points each (30 points total; 15%)

  1. Distributions & GitHub collabo

  2. Visual perception & plot reproducing

  3. Color

29 / 50

Homework

20 points each (40 points; 20%)

  • Basically the same as the labs, but scored correct/incorrect, and no in-class time devoted to them.

  • Okay to work on collaboratively - I actively encourage you to do so as long as you're using a shared repo

  • Homework 1: Creating new visuals while utilizing different types of joins

  • Homework 2: Visualizing uncertainty, tables, and plot refinement

30 / 50

Quick note on reproducibility

A great blog post by Rafael Irizarry shows how almost any plot you see in popular media can be reproduced in R with ggplot.

For example

31 / 50

WSJ Version

32 / 50

ggplot reproduction

33 / 50

Data viz "in the wild" presentations

Everyone will present for ~5 minutes - order randomly assigned (coming up next)

  • Find two data viz examples intended for two different audiences

  • Discuss the following

    • What each viz is trying to communicate
    • How effective do you judge it? Why?
    • At least 1 area of strength
    • At least 1 area for (potential) improvement
34 / 50

Presentation order

I will email this out as well.

Date Presenter
2021-01-11 Kay
2021-01-11 Wanjia
2021-01-13 Joe
2021-01-13 Janette
2021-01-20 Kavya
2021-01-20 Meg
2021-01-25 Anisha
2021-01-25 Rachael
2021-01-27 Zach
2021-01-27 Tess
2021-02-01 Chris
2021-02-01 Vinita
2021-02-03 Shijing
2021-02-03 David
2021-02-10 Raleigh
2021-02-10 Maggie
2021-02-15 Ann-Marie
2021-02-15 Murat
2021-02-17 Sarah Don
2021-02-22 Hyeonjin
2021-02-24 Anwesha
2021-03-01 Makayla
2021-03-03 Sarah Dim
35 / 50

Final Project

120 points total (60%)

36 / 50

Five parts

  • Proposal (10 points): Due 1/27/21

  • Draft (15 points): Due 2/24/21

  • Peer review (15 points): Assigned, 2/24/21; Due 3/3/21

  • Presentation (20 points): 3/8/21 and 3/10/21 (Week 10)

  • Product (60 points): Due 11:59:59 PM, 3/17/21

37 / 50

Product

Four components:

38 / 50
  • At least three finalized data displays, with each accompanied by a strong narrative/story, as well as the history of how the visualization changed over time.

  • Housed on GitHub

    • Fully reproducible
  • Deployed through GitHub pages (or netlify or similar)

39 / 50

Proposal

Four components:

  • Description of the data source (must be publicly available)

  • Preliminary ideas of different viz

  • Identification of the intended audience for each viz

  • The intended message to be communicated for each viz

40 / 50

Draft

  • Expected to still be a work in progress

    • Data visualizations should be largely complete
  • Deployment not expected

  • Provided to your peers so they can learn from you as much as you can learn from their feedback

41 / 50

Peer Review

  • We are all professionals here. It is imperative we act like it.

  • Understand the purpose of the exercise.

  • Zero tolerance policy for inappropriate comments

  • Should be vigorously encouraging

Utilizing GitHub

You'll be assigned three drafts to review (5 points each)

  • Fork their repo, embed comments & suggest changes to their code, submit a PR
42 / 50

Presentation

Order randomly assigned. Basically a chance to share what you created!

  • Discuss what is trying to be communicated

  • Share the final products

  • Discuss the progression along the way and why specific changes were made

43 / 50

Grading

44 / 50

Points

200 points total

  • 3 labs at 10 points each (30 points; 15%)
  • 2 homework assignments at 20 points each (40 points; 20%)
  • Five-minute data visualization "in the wild" presentation (10 points; 5%)
  • Final Project (120 points; 60%)
    • Proposal (10 points; 5%)
    • Draft (15 points; 7.5%)
    • Peer review (15 points; 7.5%)
    • Presentation (20 points; 10%)
    • Product (60 points; 30%)
45 / 50

Grading

Lower percent  Lower point range  Grade  Upper point range  Upper percent
0.97           (194 pts)          A+
0.93           (186 pts)          A      (194 pts)          0.97
0.90           (180 pts)          A-     (186 pts)          0.93
0.87           (174 pts)          B+     (180 pts)          0.90
0.83           (166 pts)          B      (174 pts)          0.87
0.80           (160 pts)          B-     (166 pts)          0.83
0.77           (154 pts)          C+     (160 pts)          0.80
0.73           (146 pts)          C      (154 pts)          0.77
0.70           (140 pts)          C-     (146 pts)          0.73
                                  F      (140 pts)          0.70
46 / 50
47 / 50

Full lecture on Wednesday

My goal: To make you the most prepared cohort with GitHub to date!

48 / 50

Demo

  • The GitKraken GUI
  • Creating a GitHub repo
  • Sharing access (or creating an organization)
  • Cloning the repo
  • stage, commit, push
  • pull
  • branching
  • forking and issues
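
A rough command-line sketch of the stage, commit, push, pull, and branching steps from the list above (the demo itself uses the GitKraken GUI). GitHub is replaced by a local bare repository so the example runs offline; the file name, branch name, and user details are hypothetical. Creating a repo on GitHub, sharing access, forking, and issues all happen on the GitHub site and are not shown.

```shell
set -eu
sandbox=$(mktemp -d)

# Stand-in for a GitHub repo: a local bare repository (assumption).
git init --quiet --bare "$sandbox/remote.git"

# Clone the repo and set a commit identity for this sandbox only.
git clone --quiet "$sandbox/remote.git" "$sandbox/project"
cd "$sandbox/project"
git config user.name "Example Student"
git config user.email "student@example.com"

# Stage, commit, push.
echo "library(ggplot2)" > plot.R
git add plot.R                              # stage
git commit --quiet -m "Start plot script"   # commit
git push --quiet -u origin HEAD             # push and set the upstream branch

# Branching: make a change on its own branch and push that branch.
git checkout --quiet -b tweak-theme
echo "theme_set(theme_minimal())" >> plot.R
git commit --quiet -am "Use a minimal theme"
git push --quiet -u origin tweak-theme

# Pull: bring in anything collaborators have pushed to this branch.
git pull --quiet --ff-only

git log --oneline   # two commits, the newest on tweak-theme
```

Keeping each change on a branch is what makes the later pull request step possible: the branch is the unit that gets reviewed and merged.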
49 / 50

Next time

Collaborating with GitHub

50 / 50
