Remember to change the author: field on this Rmd file to your own name.

Learning objectives

In today’s Lab you will gain practice with the following concepts from today’s class:

  • Using the qplot and ggplot commands from the ggplot2 library
  • Specifying shape and color attributes
  • Using facet_grid to create plots that show the data broken down by various subgroups
  • Constructing geographic heatmaps

Problems

We’ll begin by loading all the required packages.

library(tidyverse)
## ── Attaching packages ────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.1     ✔ purrr   0.3.3
## ✔ tibble  2.1.3     ✔ dplyr   0.8.3
## ✔ tidyr   1.0.0     ✔ stringr 1.4.0
## ✔ readr   1.3.1     ✔ forcats 0.4.0
## ── Conflicts ───────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
Cars93 <- as_tibble(MASS::Cars93)

1. facet_grid

Using the diamonds data set and the facet_grid command, create a figure that shows a scatterplot of price against carat for each combination of cut and clarity.

There are 8 levels of clarity, and 5 levels of cut. Your figure should therefore contain 40 scatterplots.

# Edit me

2. Plotting the Cars93 data

This problem uses the Cars93 dataset from the MASS package.

(a) Use qplot to create a scatterplot with Price on the y-axis and EngineSize on the x-axis.

# Edit me

Describe the relationship between Price and EngineSize.

Replace this text with your solution.

(b) Repeat part (a) using the ggplot function and geom_point() layer.

# Edit me

(c) Repeat part (b), but this time specifying that the color mapping should depend on Type and the shape mapping should depend on DriveTrain.

# Edit me

Do you see any obvious patterns in how the different Types of cars cluster in the plot? Describe any clear patterns that you see.

Replace this text with your solution.

Do you see any obvious patterns in how the different DriveTrains of cars cluster in the plot? Describe any clear patterns that you see.

Replace this text with your solution.

(d) Construct boxplots showing Price on the y-axis and AirBags on the x-axis. (Hint: boxplot is a valid ggplot2 geometry)

# Edit me

Do you observe any association between AirBag type and Price? Explain.

Replace this text with your solution.

3. Plotting a map

At the end of lecture we used the following code to generate a headmap of murder rates in the US.

library(maps)
## 
## Attaching package: 'maps'
## The following object is masked from 'package:purrr':
## 
##     map
# Create data frame for map data (US states)
states <- map_data("state")

# Here's what the states data frame looks like
str(states)
## 'data.frame':    15537 obs. of  6 variables:
##  $ long     : num  -87.5 -87.5 -87.5 -87.5 -87.6 ...
##  $ lat      : num  30.4 30.4 30.4 30.3 30.3 ...
##  $ group    : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ order    : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ region   : chr  "alabama" "alabama" "alabama" "alabama" ...
##  $ subregion: chr  NA NA NA NA ...
# Make a copy of the data frame to manipulate
arrests <- USArrests

# Convert everything to lower case
names(arrests) <- tolower(names(arrests))
arrests$region <- tolower(rownames(USArrests))

# Merge the map data with the arrests data based on region
choro <- merge(states, arrests, sort = FALSE, by = "region")
choro <- choro[order(choro$order), ]

# Plot a map, filling in the states based on murder rate
qplot(long, lat, data = choro, group = group, fill = murder,
  geom = "polygon") + scale_fill_gradient(low = "#56B1F7", high = "#132B43")

Modify the code above to produce a heatmap of assault rates instead, with orange colours instead of blue colours for the gradient.

Here’s a document that may help you pick colors: Hex colour picker