
Geographic Data

A very quick introduction

Daniel Anderson

Week 9, Class 1

1 / 74

Data viz in the Wild

Makayla

Sarah Dimakis on deck

2 / 74

First - a disclaimer

  • We're only talking about visualizing geographic data, not analyzing geographic data

  • Even so, there's SO MUCH we won't get to

  • Today is an intro - lots more you can do, hopefully you'll feel comfortable with the basics

3 / 74

Learning objectives

  • Know the difference between vector and raster data

  • Be able to produce basic maps

  • Be able to obtain different types of geographic data from a few different places

  • Be able to produce basic interactive maps

  • Understand the basics of the R geospatial ecosystem

Today is partially about content and partially about exposure

4 / 74

Where to learn more

Geocomputation with R

5 / 74

Zev Ross 2-day Workshop

From rstudio::conf(2020)

Some of this presentation comes from the above.

6 / 74

Vector versus raster data

Image from Zev Ross

7 / 74

Vector data

  • points, lines, and polygons

  • Can easily include non-spatial data (e.g., number of people living within the polygon)

  • Come in the form of shapefiles (.shp), GeoJSON, or frequently in R packages.

This is what we'll talk about almost exclusively today

Tends to be the most relevant for social science research questions

8 / 74

Raster data

  • Divide the space into a grid

  • Assign each square (pixel) a value

Common formats include image files (e.g., GeoTIFF), and rasters are often used for satellite and remote sensing data.

Can occasionally be helpful in social science data to show things like population density.
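The grid-of-values idea can be sketched in a few lines with the {raster} package (a hypothetical example of mine, not from the slides):

```r
library(raster)

# divide a bounding box into a 10 x 10 grid
r <- raster(nrows = 10, ncols = 10,
            xmn = -124, xmx = -122, ymn = 43, ymx = 45)

# assign each square (pixel) a value
values(r) <- runif(ncell(r))

plot(r) # renders the grid as a colored image
```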

9 / 74

Example

source

10 / 74

Some of the #rspatial ecosystem

My goal

Take you through at least a basic tour of each of these (minus {raster}, although we'll discuss raster data).

11 / 74

Some specific challenges with geospatial data

  • Coordinate reference systems and projections (we won't have much time for this)

  • List columns (specifically when working with {sf} objects)

  • Different geometry types (lines, points, polygons)

  • Vector versus raster

  • Data regularly stored in data "cubes" or "bricks" to represent, e.g., longitude, latitude, and elevation, or time series, or different colors
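For the first bullet, the usual {sf} workflow is to check a layer's CRS and reproject before combining or measuring layers. A quick sketch (assuming an {sf} object like the roads_laneco data we load later):

```r
library(sf)

st_crs(roads_laneco) # inspect the current CRS (NAD83 for the TIGER files)

# reproject, e.g., to Oregon Lambert (EPSG:2992) before measuring distances
roads_projected <- st_transform(roads_laneco, crs = 2992)
```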

12 / 74

Getting spatial data

  • We'll only cover a few ways to do this

  • Purposefully United States-centric

  • Generally, reading shapefiles is not terrifically difficult. Reading in and manipulating raster data can be tricky at times.

  • Lots of organizations out there publish spatial data, and a fair amount are available through R packages

13 / 74

Working with spatial data

Two basic options

  • Spatial*DataFrame (from the {sp} package)

  • sf data frame (simple features)

    • We'll mostly talk about this

I can show you Spatial*DataFrame outside the slides (it hung things up here). Generally, I'd stick with {sf}.

Use sf::st_as_sf to convert {sp} to {sf}
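A minimal sketch of converting in both directions (sp_obj here is a stand-in for any Spatial*DataFrame, not an object from the slides):

```r
library(sf)

sf_obj <- st_as_sf(sp_obj)      # {sp} -> {sf}
sp_obj <- as(sf_obj, "Spatial") # {sf} -> {sp}, which some packages still expect
```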

14 / 74

{tigris}

library(tigris)
library(sf)
options(tigris_class = "sf")
roads_laneco <- roads("OR", "Lane")
roads_laneco
## Simple feature collection with 20433 features and 4 fields
## geometry type: LINESTRING
## dimension: XY
## bbox: xmin: -124.1536 ymin: 43.4376 xmax: -121.8078 ymax: 44.29001
## geographic CRS: NAD83
## # A tibble: 20,433 x 5
## LINEARID FULLNAME RTTYP MTFCC geometry
## <chr> <chr> <chr> <chr> <LINESTRING [°]>
## 1 1102152610459 W Lone Oak Lp M S1640 (-123.1256 44.10108, -123.1262 …
## 2 1102217699747 Sheldon Village Lp M S1640 (-123.0742 44.07891, -123.073 4…
## 3 110458663505 Cottage Hts Lp M S1400 (-123.0522 43.7893, -123.0522 4…
## 4 1102152615811 Village Plz Lp M S1640 (-123.1051 44.08716, -123.1058 …
## 5 110458661289 River Pointe Lp M S1400 (-123.0864 44.10306, -123.0852 …
## 6 1102217699746 Sheldon Village Lp M S1640 (-123.0723 44.07875, -123.0723 …
## 7 110458664549 Village Plz Lp M S1400 (-123.1053 44.08658, -123.1052 …
## 8 1102223141058 Carpenter Byp M S1400 (-123.3368 43.78013, -123.3367 …
## 9 1106092829172 State Hwy 126 Bus S S1200 (-123.0502 44.04391, -123.0502 …
## 10 1104493105112 State Hwy 126 Bus S S1200 (-122.996 44.04587, -122.9959 4…
## # … with 20,423 more rows
15 / 74

I/O

Let's say I want to write the file to disk.

# from the sf library
write_sf(roads_laneco, here::here("data", "roads_lane.shp"))

Then read it in later

roads_laneco <- read_sf(here::here("data", "roads_lane.shp"))
roads_laneco
## Simple feature collection with 20433 features and 4 fields
## geometry type: LINESTRING
## dimension: XY
## bbox: xmin: -124.1536 ymin: 43.4376 xmax: -121.8078 ymax: 44.29001
## geographic CRS: NAD83
## # A tibble: 20,433 x 5
## LINEARID FULLNAME RTTYP MTFCC geometry
## <chr> <chr> <chr> <chr> <LINESTRING [°]>
## 1 1102152610459 W Lone Oak Lp M S1640 (-123.1256 44.10108, -123.1262 …
## 2 1102217699747 Sheldon Village Lp M S1640 (-123.0742 44.07891, -123.073 4…
## 3 110458663505 Cottage Hts Lp M S1400 (-123.0522 43.7893, -123.0522 4…
## 4 1102152615811 Village Plz Lp M S1640 (-123.1051 44.08716, -123.1058 …
## 5 110458661289 River Pointe Lp M S1400 (-123.0864 44.10306, -123.0852 …
## 6 1102217699746 Sheldon Village Lp M S1640 (-123.0723 44.07875, -123.0723 …
## 7 110458664549 Village Plz Lp M S1400 (-123.1053 44.08658, -123.1052 …
## 8 1102223141058 Carpenter Byp M S1400 (-123.3368 43.78013, -123.3367 …
## 9 1106092829172 State Hwy 126 Bus S S1200 (-123.0502 44.04391, -123.0502 …
## 10 1104493105112 State Hwy 126 Bus S S1200 (-122.996 44.04587, -122.9959 4…
## # … with 20,423 more rows
16 / 74

{sf} works with ggplot

Use ggplot2::geom_sf

ggplot(roads_laneco) +
  geom_sf(color = "gray60")

17 / 74

Add water features

lakes <- area_water("OR", "Lane")
streams <- linear_water("OR", "Lane")
ggplot() +
  geom_sf(data = lakes, fill = "#518FB5") +      # Add lakes
  geom_sf(data = streams, color = "#518FB5") +   # Add streams/drainage
  geom_sf(data = roads_laneco, color = "gray60") # Add roads

Note - these functions are all from the {tigris} package.

18 / 74

19 / 74

Quick aside

Similar package osmdata

  • Specifically for street-level data.
  • We'll just use the bounding box functionality, but you can add many of the same things (and there are other packages that will provide you with bounding boxes)
bb <- osmdata::getbb("Eugene")
bb
##          min        max
## x -123.20876 -123.03589
## y   43.98753   44.13227
20 / 74
ggplot() +
  geom_sf(data = lakes, fill = "#518FB5") +                # Add lakes
  geom_sf(data = streams, color = "#518FB5", size = 1.2) + # Add streams
  geom_sf(data = roads_laneco, color = "gray60") +         # Add roads
  coord_sf(xlim = bb[1, ], ylim = bb[2, ])                 # Limit range
21 / 74

22 / 74

Quickly

Same thing, but fully with {osmdata}

library(osmdata)
library(colorspace)
bb <- getbb("Eugene")
roads <- bb %>%
  opq() %>%                      # overpass query
  add_osm_feature("highway") %>% # feature to add
  osmdata_sf()                   # change it to sf
water <- bb %>%
  opq() %>%
  add_osm_feature("water") %>%
  osmdata_sf()
23 / 74

Use the data to plot

ggplot() +
  geom_sf(data = water$osm_multipolygons,
          fill = "#518FB5",
          color = darken("#518FB5")) +
  geom_sf(data = water$osm_polygons,
          fill = "#518FB5",
          color = darken("#518FB5")) +
  geom_sf(data = water$osm_lines,
          color = darken("#518FB5")) +
  geom_sf(data = roads$osm_lines,
          color = "gray40",
          size = 0.2) +
  coord_sf(xlim = bb[1, ],
           ylim = bb[2, ],
           expand = FALSE) +
  labs(caption = "Eugene, OR")
24 / 74

25 / 74

Let's get some census data

Note

To do this, you need to first register an API key with the US Census, which you can do here. Then use census_api_key("YOUR API KEY").

Alternatively, you can specify CENSUS_API_KEY = "YOUR API KEY" in .Renviron. You can do this by using usethis::edit_r_environ()
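The .Renviron entry itself is a single line (a config fragment):

```
CENSUS_API_KEY="YOUR API KEY"
```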

26 / 74

Getting the data

library(tidycensus)
# Find variable code
# v <- load_variables(2018, "acs5")
# View(v)
census_vals <- get_acs(geography = "tract",
                       state = "OR",
                       variables = c(med_income = "B06011_001",
                                     ed_attain = "B15003_001"),
                       year = 2018,
                       geometry = TRUE)
27 / 74

Look at the data

census_vals
## Simple feature collection with 1668 features and 5 fields (with 12 geometries empty)
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -124.5662 ymin: 41.99179 xmax: -116.4635 ymax: 46.29204
## geographic CRS: NAD83
## First 10 features:
## GEOID NAME variable estimate
## 1 41045970600 Census Tract 9706, Malheur County, Oregon med_income 23670
## 2 41045970600 Census Tract 9706, Malheur County, Oregon ed_attain 2714
## 3 41047001200 Census Tract 12, Marion County, Oregon med_income 27106
## 4 41047001200 Census Tract 12, Marion County, Oregon ed_attain 2695
## 5 41047002000 Census Tract 20, Marion County, Oregon med_income 33271
## 6 41047002000 Census Tract 20, Marion County, Oregon ed_attain 7260
## 7 41047010501 Census Tract 105.01, Marion County, Oregon med_income 31291
## 8 41047010501 Census Tract 105.01, Marion County, Oregon ed_attain 4099
## 9 41051000301 Census Tract 3.01, Multnomah County, Oregon med_income 22531
## 10 41051000301 Census Tract 3.01, Multnomah County, Oregon ed_attain 3550
## moe geometry
## 1 3812 MULTIPOLYGON (((-117.4678 4...
## 2 211 MULTIPOLYGON (((-117.4678 4...
## 3 5112 MULTIPOLYGON (((-123.0447 4...
## 4 213 MULTIPOLYGON (((-123.0447 4...
## 5 3565 MULTIPOLYGON (((-123.0362 4...
## 6 477 MULTIPOLYGON (((-123.0362 4...
## 7 5080 MULTIPOLYGON (((-122.7894 4...
## 8 363 MULTIPOLYGON (((-122.7894 4...
## 9 3680 MULTIPOLYGON (((-122.6435 4...
## 10 436 MULTIPOLYGON (((-122.6435 4...
28 / 74

Plot it

library(colorspace)
ggplot(census_vals) +
  geom_sf(aes(fill = estimate, color = estimate)) +
  facet_wrap(~variable) +
  guides(color = "none") +
  scale_fill_continuous_diverging("Blue-Red 3",
                                  rev = TRUE) +
  scale_color_continuous_diverging("Blue-Red 3",
                                   rev = TRUE)
29 / 74

hmm...

30 / 74

Try again

library(colorspace)
income <- filter(census_vals, variable == "med_income")
income_plot <- ggplot(income) +
  geom_sf(aes(fill = estimate, color = estimate)) +
  facet_wrap(~variable) +
  guides(color = "none") +
  scale_fill_continuous_diverging(
    "Blue-Red 3",
    rev = TRUE,
    mid = mean(income$estimate, na.rm = TRUE)
  ) +
  scale_color_continuous_diverging(
    "Blue-Red 3",
    rev = TRUE,
    mid = mean(income$estimate, na.rm = TRUE)
  ) +
  theme(legend.position = "bottom",
        legend.key.width = unit(2, "cm"))
31 / 74
income_plot

32 / 74

Same thing for education

ed <- filter(census_vals, variable == "ed_attain")
ed_plot <- ggplot(ed) +
  geom_sf(aes(fill = estimate, color = estimate)) +
  facet_wrap(~variable) +
  guides(color = "none") +
  scale_fill_continuous_diverging(
    "Blue-Red 3",
    rev = TRUE,
    mid = mean(ed$estimate, na.rm = TRUE)
  ) +
  scale_color_continuous_diverging(
    "Blue-Red 3",
    rev = TRUE,
    mid = mean(ed$estimate, na.rm = TRUE)
  ) +
  theme(legend.position = "bottom",
        legend.key.width = unit(2, "cm"))
33 / 74
ed_plot

34 / 74

Put them together

gridExtra::grid.arrange(income_plot, ed_plot, ncol = 2)

35 / 74

Bivariate color scales

36 / 74
37 / 74

How?

There are a few different ways. Here's one:

  • Break continuous variable into categorical values

  • Assign each combination of values between categorical vars a color

  • Make sure the combinations of the colors make sense

gif from Joshua Stevens

38 / 74

Do it

Note - this will be fairly quick. I'm not expecting you to know how to do this, but I want to show you the idea and give you the breadcrumbs for the code you may need.

First - reshape the data to a wider format

wider <- census_vals %>%
  select(-moe) %>%
  spread(variable, estimate) %>% # pivot_wider doesn't work w/sf yet
  drop_na(ed_attain, med_income)
wider
## Simple feature collection with 825 features and 4 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -124.5662 ymin: 41.99179 xmax: -116.4635 ymax: 46.29204
## geographic CRS: NAD83
## First 10 features:
## GEOID NAME ed_attain med_income
## 1 41001950100 Census Tract 9501, Baker County, Oregon 2228 24846
## 2 41001950200 Census Tract 9502, Baker County, Oregon 2374 23288
## 3 41001950300 Census Tract 9503, Baker County, Oregon 1694 24080
## 4 41001950400 Census Tract 9504, Baker County, Oregon 2059 24083
## 5 41001950500 Census Tract 9505, Baker County, Oregon 1948 26207
## 6 41001950600 Census Tract 9506, Baker County, Oregon 1604 23381
## 7 41003000100 Census Tract 1, Benton County, Oregon 5509 26188
## 8 41003000202 Census Tract 2.02, Benton County, Oregon 3630 35343
## 9 41003000400 Census Tract 4, Benton County, Oregon 5562 40869
## 10 41003000500 Census Tract 5, Benton County, Oregon 2526 38922
## geometry
## 1 MULTIPOLYGON (((-118.5194 4...
## 2 MULTIPOLYGON (((-117.9158 4...
## 3 MULTIPOLYGON (((-117.9506 4...
## 4 MULTIPOLYGON (((-117.8309 4...
## 5 MULTIPOLYGON (((-117.9774 4...
## 6 MULTIPOLYGON (((-117.7775 4...
## 7 MULTIPOLYGON (((-123.2812 4...
## 8 MULTIPOLYGON (((-123.3415 4...
## 9 MULTIPOLYGON (((-123.3039 4...
## 10 MULTIPOLYGON (((-123.2979 4...
39 / 74

Find the tertiles

# length.out = 4 gives breaks at 0, 1/3, 2/3, and 1 - i.e., three bins
ed_quartiles <- quantile(wider$ed_attain,
                         probs = seq(0, 1, length.out = 4))
inc_quartiles <- quantile(wider$med_income,
                          probs = seq(0, 1, length.out = 4))
ed_quartiles
##        0% 33.33333% 66.66667%      100%
##    54.000  2675.333  3894.667 10039.000
40 / 74

Create the cut variable

wider <- wider %>%
  mutate(cat_ed = cut(ed_attain, ed_quartiles),
         cat_inc = cut(med_income, inc_quartiles))
wider %>%
  select(starts_with("cat"))
## Simple feature collection with 825 features and 2 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -124.5662 ymin: 41.99179 xmax: -116.4635 ymax: 46.29204
## geographic CRS: NAD83
## First 10 features:
## cat_ed cat_inc geometry
## 1 (54,2.68e+03] (7.46e+03,2.56e+04] MULTIPOLYGON (((-118.5194 4...
## 2 (54,2.68e+03] (7.46e+03,2.56e+04] MULTIPOLYGON (((-117.9158 4...
## 3 (54,2.68e+03] (7.46e+03,2.56e+04] MULTIPOLYGON (((-117.9506 4...
## 4 (54,2.68e+03] (7.46e+03,2.56e+04] MULTIPOLYGON (((-117.8309 4...
## 5 (54,2.68e+03] (2.56e+04,3.24e+04] MULTIPOLYGON (((-117.9774 4...
## 6 (54,2.68e+03] (7.46e+03,2.56e+04] MULTIPOLYGON (((-117.7775 4...
## 7 (3.89e+03,1e+04] (2.56e+04,3.24e+04] MULTIPOLYGON (((-123.2812 4...
## 8 (2.68e+03,3.89e+03] (3.24e+04,7.86e+04] MULTIPOLYGON (((-123.3415 4...
## 9 (3.89e+03,1e+04] (3.24e+04,7.86e+04] MULTIPOLYGON (((-123.3039 4...
## 10 (54,2.68e+03] (3.24e+04,7.86e+04] MULTIPOLYGON (((-123.2979 4...
41 / 74

Set palette

# First drop geo column
pal <- st_drop_geometry(wider) %>%
  count(cat_ed, cat_inc) %>%
  arrange(cat_ed, cat_inc) %>%
  drop_na(cat_ed, cat_inc) %>%
  mutate(pal = c("#F3F3F3", "#C3F1D5", "#8BE3AF",
                 "#EBC5DD", "#C3C5D5", "#8BC5AF",
                 "#E7A3D1", "#C3A3D1", "#8BA3AE"))
pal
## cat_ed cat_inc n pal
## 1 (54,2.68e+03] (7.46e+03,2.56e+04] 113 #F3F3F3
## 2 (54,2.68e+03] (2.56e+04,3.24e+04] 87 #C3F1D5
## 3 (54,2.68e+03] (3.24e+04,7.86e+04] 73 #8BE3AF
## 4 (2.68e+03,3.89e+03] (7.46e+03,2.56e+04] 85 #EBC5DD
## 5 (2.68e+03,3.89e+03] (2.56e+04,3.24e+04] 97 #C3C5D5
## 6 (2.68e+03,3.89e+03] (3.24e+04,7.86e+04] 93 #8BC5AF
## 7 (3.89e+03,1e+04] (7.46e+03,2.56e+04] 75 #E7A3D1
## 8 (3.89e+03,1e+04] (2.56e+04,3.24e+04] 91 #C3A3D1
## 9 (3.89e+03,1e+04] (3.24e+04,7.86e+04] 109 #8BA3AE
42 / 74

Join & plot

bivar_map <- left_join(wider, pal) %>%
  ggplot() +
  geom_sf(aes(fill = pal, color = pal)) +
  guides(fill = "none", color = "none") +
  scale_fill_identity() +
  scale_color_identity()
43 / 74

44 / 74

Add in legend

First create it

leg <- ggplot(pal, aes(cat_ed, cat_inc)) +
  geom_tile(aes(fill = pal)) +
  scale_fill_identity() +
  coord_fixed() +
  labs(x = expression("Higher education" %->% ""),
       y = expression("Higher income" %->% "")) +
  theme(axis.text = element_blank(),
        axis.title = element_text(size = 12))
leg

45 / 74

Put together

library(cowplot)
ggdraw() +
  draw_plot(bivar_map + theme_void(), 0.1, 0.1, 1, 1) +
  draw_plot(leg, -0.05, 0, 0.3, 0.3)

Coordinates are mostly guess/check depending on aspect ratio

46 / 74

47 / 74

{tmap}

Back to just one variable

I mostly use ggplot(), but the {tmap} package is really powerful and the syntax is pretty straightforward, so let's have a quick overview.

48 / 74

Income map with {tmap}

library(tmap)
tm_shape(wider) +
  tm_polygons("med_income")

49 / 74

Facet

tm_shape(census_vals) +
  tm_polygons("estimate") +
  tm_facets("variable")

50 / 74

Change colors

tm_shape(wider) +
  tm_polygons("ed_attain",
              palette = "magma",
              border.col = "gray90",
              lwd = 0.1)

51 / 74

Continuous legend

tm_shape(wider) +
  tm_polygons("ed_attain",
              style = "cont") +
  tm_layout(legend.outside = TRUE)

52 / 74

Add text

  • First, let's get data at the county level, instead of census tract level
cnty <- get_acs(geography = "county",
                state = "OR",
                variables = c(ed_attain = "B15003_001"),
                year = 2018,
                geometry = TRUE)
53 / 74
cnty
## Simple feature collection with 36 features and 5 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -124.5662 ymin: 41.99179 xmax: -116.4635 ymax: 46.29204
## geographic CRS: NAD83
## First 10 features:
## GEOID NAME variable estimate moe
## 1 41005 Clackamas County, Oregon ed_attain 285481 121
## 2 41021 Gilliam County, Oregon ed_attain 1448 97
## 3 41033 Josephine County, Oregon ed_attain 63006 172
## 4 41035 Klamath County, Oregon ed_attain 46345 108
## 5 41039 Lane County, Oregon ed_attain 251966 137
## 6 41043 Linn County, Oregon ed_attain 85098 147
## 7 41051 Multnomah County, Oregon ed_attain 579186 133
## 8 41055 Sherman County, Oregon ed_attain 1238 73
## 9 41061 Union County, Oregon ed_attain 17317 115
## 10 41007 Clatsop County, Oregon ed_attain 27935 125
## geometry
## 1 MULTIPOLYGON (((-122.8679 4...
## 2 MULTIPOLYGON (((-120.6535 4...
## 3 MULTIPOLYGON (((-124.042 42...
## 4 MULTIPOLYGON (((-122.29 42....
## 5 MULTIPOLYGON (((-124.1503 4...
## 6 MULTIPOLYGON (((-123.2608 4...
## 7 MULTIPOLYGON (((-122.9292 4...
## 8 MULTIPOLYGON (((-121.0312 4...
## 9 MULTIPOLYGON (((-118.6978 4...
## 10 MULTIPOLYGON (((-123.5989 4...
54 / 74

Estimate polygon centroid

centroids <- st_centroid(cnty)

Extract just county name

centroids <- centroids %>%
  mutate(county = str_replace_all(NAME, " County, Oregon", ""))
55 / 74

Plot

tm_shape(cnty) +
  tm_polygons("estimate",
              style = "cont") +
  tm_shape(centroids) +
  tm_text("county", size = 0.5) +
  tm_layout(legend.outside = TRUE)

56 / 74

Add raster elevation data

states <- get_acs("state",
                  variables = c(ed_attain = "B15003_001"),
                  year = 2018,
                  geometry = TRUE)
or <- filter(states, NAME == "Oregon")
# convert to spatial data frame
sp <- as(or, "Spatial")
# use elevatr library to pull data
library(elevatr)
or_elev <- get_elev_raster(sp, z = 9)
57 / 74

Plot

tm_shape(or_elev) +
  tm_raster(midpoint = NA,
            style = "cont") +
  tm_layout(legend.outside = TRUE) +
  tm_shape(cnty) +
  tm_borders(col = "gray60")

58 / 74

Add custom palette

tm_shape(or_elev) +
  tm_raster(midpoint = NA,
            style = "cont",
            palette = c("#E2FCFF", "#83A9CE", "#485C6E",
                        "#181818", "#5C5B3E", "#AAA971",
                        "#FCFCD3", "#ffffff")) +
  tm_layout(legend.outside = TRUE) +
  tm_shape(cnty) +
  tm_borders(col = "gray60")
59 / 74

60 / 74

You can do some amazing things!

61 / 74

Create interactive maps

Just run tmap_mode("view"), then run the same code as before

tmap_mode("view")
tm_shape(cnty) +
  tm_polygons("estimate") +
  tm_shape(centroids) +
  tm_text("county", size = 0.5)
62 / 74
63 / 74

mapview

  • Really quick, easy interactive maps
library(mapview)
mapview(cnty)
64 / 74
mapview(cnty, zcol = "estimate")
65 / 74
mapview(cnty,
        zcol = "estimate",
        popup = leafpop::popupTable(cnty,
                                    zcol = c("NAME", "estimate")))
66 / 74

A few other things of note

67 / 74

statebins

library(statebins)
statebins(states,
          state_col = "NAME",
          value_col = "estimate") +
  theme_void()

68 / 74

Cartograms

library(cartogram)
or_county_pop <- get_acs(geography = "county",
                         state = "OR",
                         variables = "B00001_001",
                         year = 2018,
                         geometry = TRUE)
# Set projection
or_county_pop <- st_transform(or_county_pop,
                              crs = 2992)
# found the CRS here: https://www.oregon.gov/geo/pages/projections.aspx
carto_counties <- cartogram_cont(or_county_pop, "estimate")
69 / 74

Compare

ggplot(or_county_pop) +
  geom_sf(fill = "#BCD8EB")

ggplot(carto_counties) +
  geom_sf(fill = "#D5FFFA")

70 / 74

State

state_pop <- get_acs(geography = "state",
                     variables = "B00001_001",
                     year = 2018,
                     geometry = TRUE)
# Set projection
state_pop <- st_transform(state_pop, crs = 2163)
# found the CRS here: https://epsg.io/transform#s_srs=3969&t_srs=4326
carto_states <- cartogram_cont(state_pop, "estimate")
71 / 74

Cartogram of USA by population

ggplot(carto_states) +
  geom_sf()

72 / 74

Last note

You may or may not like cartograms. Just be careful not to lie with maps.

73 / 74

Next time

Customizing web pages with CSS

Last actual lecture for this class

74 / 74
