In today’s Lab you will gain practice with the following concepts from today’s class:
- Using the qplot and ggplot commands from the ggplot2 library
- Specifying shape and color attributes
- Using facet_grid to create plots that show the data broken down by various subgroups
- Constructing geographic heatmaps
We’ll begin by loading all the required packages.
library(tidyverse)
## ── Attaching packages ────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.1 ✔ purrr 0.3.3
## ✔ tibble 2.1.3 ✔ dplyr 0.8.3
## ✔ tidyr 1.0.0 ✔ stringr 1.4.0
## ✔ readr 1.3.1 ✔ forcats 0.4.0
## ── Conflicts ───────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
Cars93 <- as_tibble(MASS::Cars93)
Using the diamonds data set and the facet_grid command, create a figure that shows a scatterplot of price against carat for each combination of cut and clarity.
There are 8 levels of clarity, and 5 levels of cut. Your figure should therefore contain 40 scatterplots.
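As a reminder of the facet_grid syntax, here is a minimal sketch on a different data set, so it is not a solution to this problem. It assumes the mpg data that ships with ggplot2; the choice of the drv and cyl variables is ours, purely for illustration. The formula rows ~ columns puts the levels of the first variable in rows and the levels of the second in columns.

```r
library(ggplot2)

# Illustrative only: mpg ships with ggplot2 and is not the data set for this problem.
# facet_grid(drv ~ cyl) draws one panel per combination of drv (rows) and cyl (columns).
p_facet <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  facet_grid(drv ~ cyl)

p_facet
```

Since drv has 3 distinct values and cyl has 4, this grid has 12 panels, by the same counting argument used above for cut and clarity.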
# Edit me
This problem uses the Cars93 dataset from the MASS package.
(a) Use qplot to create a scatterplot with Price on the y-axis and EngineSize on the x-axis.
# Edit me
Describe the relationship between Price and EngineSize.
Replace this text with your solution.
(b) Repeat part (a) using the ggplot function and a geom_point() layer.
# Edit me
(c) Repeat part (b), but this time specify that the color mapping should depend on Type and the shape mapping should depend on DriveTrain.
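The difference between mapping an aesthetic and setting it can be sketched as follows. This is an illustration on the mpg data from ggplot2, not on Cars93; the variables class and drv are our own choice. Aesthetics named inside aes() vary with the data and get a legend, whereas a value given outside aes() would apply to every point.

```r
library(ggplot2)

# Illustrative only: uses mpg rather than Cars93.
# color and shape are mapped to variables inside aes(), so each point is
# drawn according to its class and drv values and legends are added.
p_aes <- ggplot(mpg, aes(x = displ, y = hwy, color = class, shape = drv)) +
  geom_point()

p_aes
```

Shape works best for variables with only a few categories, which is why drv (three values) is mapped to shape here and the variable with more categories is mapped to color.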
# Edit me
Do you see any obvious patterns in how the different Types of cars cluster in the plot? Describe any clear patterns that you see.
Replace this text with your solution.
Do you see any obvious patterns in how the different DriveTrains of cars cluster in the plot? Describe any clear patterns that you see.
Replace this text with your solution.
(d) Construct boxplots showing Price on the y-axis and AirBags on the x-axis. (Hint: boxplot is a valid ggplot2 geometry.)
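To make the hint concrete, here is a small sketch of the boxplot geometry on the mpg data from ggplot2 (again an illustrative stand-in, not the Cars93 answer). A categorical variable goes on the x-axis and a numeric variable on the y-axis, and geom_boxplot() draws one box per category.

```r
library(ggplot2)

# Illustrative only: one box of highway mileage (hwy) per vehicle class in mpg.
p_box <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot()

p_box
```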
# Edit me
Do you observe any association between AirBag type and Price? Explain.
Replace this text with your solution.
At the end of lecture we used the following code to generate a heatmap of murder rates in the US.
library(maps)
##
## Attaching package: 'maps'
## The following object is masked from 'package:purrr':
##
## map
# Create data frame for map data (US states)
states <- map_data("state")
# Here's what the states data frame looks like
str(states)
## 'data.frame': 15537 obs. of 6 variables:
## $ long : num -87.5 -87.5 -87.5 -87.5 -87.6 ...
## $ lat : num 30.4 30.4 30.4 30.3 30.3 ...
## $ group : num 1 1 1 1 1 1 1 1 1 1 ...
## $ order : int 1 2 3 4 5 6 7 8 9 10 ...
## $ region : chr "alabama" "alabama" "alabama" "alabama" ...
## $ subregion: chr NA NA NA NA ...
# Make a copy of the data frame to manipulate
arrests <- USArrests
# Convert everything to lower case
names(arrests) <- tolower(names(arrests))
arrests$region <- tolower(rownames(USArrests))
# Merge the map data with the arrests data based on region
choro <- merge(states, arrests, sort = FALSE, by = "region")
choro <- choro[order(choro$order), ]
# Plot a map, filling in the states based on murder rate
qplot(long, lat, data = choro, group = group, fill = murder,
geom = "polygon") + scale_fill_gradient(low = "#56B1F7", high = "#132B43")
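Because merge() keeps only rows whose region appears in both data frames by default, it can be worth checking which regions fail to match. The sketch below repeats the setup from the code above and compares the two sets of region names; the names map_only and data_only are ours. It assumes the maps package is installed, since map_data() depends on it.

```r
library(ggplot2)  # provides map_data(), which needs the maps package installed

states <- map_data("state")
arrests <- USArrests
names(arrests) <- tolower(names(arrests))
arrests$region <- tolower(rownames(USArrests))

# Regions present in the map data but absent from USArrests;
# merge() silently drops their polygons.
map_only <- setdiff(unique(states$region), arrests$region)

# States present in USArrests that the "state" map does not draw.
data_only <- setdiff(arrests$region, unique(states$region))

map_only
data_only
```

Any region listed in map_only will show up as a hole in the plotted map, and any state in data_only has arrest data that never reaches the plot.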
Modify the code above to produce a heatmap of assault rates instead, with orange colours instead of blue colours for the gradient.
Here’s a document that may help you pick colors: Hex colour picker
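As a quick reminder of how scale_fill_gradient picks colours, here is a toy sketch unrelated to the map. The data frame grid and the colour choices are made up for illustration: low is the colour used for the smallest values of the fill variable, high is the colour for the largest, and values in between are interpolated. Both arguments accept hex codes or R colour names.

```r
library(ggplot2)

# Hypothetical toy data: a 5 x 5 grid whose fill value grows with x + y.
grid <- expand.grid(x = 1:5, y = 1:5)
grid$value <- grid$x + grid$y

# low colours the smallest values, high colours the largest.
p_fill <- ggplot(grid, aes(x = x, y = y, fill = value)) +
  geom_tile() +
  scale_fill_gradient(low = "white", high = "darkgreen")

p_fill
```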