--- title: "Lab 6" author: "Your Name Here" date: "" output: html_document --- ##### Remember to change the `author: ` field on this Rmd file to your own name. ### Learning objectives > In today's Lab you will gain practice with the following concepts from today's class: >- Using the `qplot` and `ggplot` commands from the `ggplot2` library - Specifying `shape` and `color` attributes - Using `facet_grid` to create plots that show the data broken down by various subgroups - Constructing geographic heatmaps ### Problems We'll begin by loading all the required packages. ```{r} library(tidyverse) Cars93 <- as_tibble(MASS::Cars93) ``` #### 1. facet_grid Using the `diamonds` data set and the `facet_grid` command, create a figure that shows a scatterplot of `price` against `carat` for each combination of `cut` and `clarity`. There are `r length(levels(diamonds$clarity))` levels of clarity, and `r length(levels(diamonds$cut))` levels of cut. Your figure should therefore contain `r length(levels(diamonds$clarity)) * length(levels(diamonds$cut))` scatterplots. ```{r fig.width=10, fig.height=10, dpi=70} ggplot(diamonds, aes(x = carat, y = price)) + geom_point() + facet_grid(cut ~ clarity) ``` #### 2. Plotting the Cars93 data This problem uses the Cars93 dataset from the MASS package. **(a)** Use `qplot` to create a scatterplot with Price on the y-axis and EngineSize on the `x-axis`. ```{r, fig.align='center', fig.height=4, fig.width=5} qplot(x = EngineSize, y = Price, data = Cars93) ``` **Describe the relationship between Price and EngineSize.** > Price tends to increase as engine size increases. The variability in prices also seems to increase with engine size. **(b)** Repeat part (a) using the `ggplot` function and `geom_point()` layer. ```{r, fig.align='center', fig.height=4, fig.width=5} ggplot(Cars93, aes(x = EngineSize, y = Price)) + geom_point() ``` **(c)** Repeat part (b), but this time specifying that the `color` mapping should depend on `Type` and the `shape` mapping should depend on `DriveTrain`. ```{r, fig.align='center', fig.height=4, fig.width=5} ggplot(Cars93, aes(x = EngineSize, y = Price, colour = Type, shape = DriveTrain)) + geom_point() ``` **Do you see any obvious patterns in how the different Types of cars cluster in the plot? Describe any clear patterns that you see.** > Car types do appear to cluster. For instance, small cars tend to have small engines and low prices. Large cars ten to have mid-to-large sized engines and moderate prices. **Do you see any obvious patterns in how the different DriveTrains of cars cluster in the plot? Describe any clear patterns that you see.** > This is less easy to discern. There certainly isn’t as much clustering as there is by car Type. Below we switch the role of Type and DriveTrain to make it easier to see patterns. The points are quite diffuse within each drivetrain type. **(d)** Construct boxplots showing Price on the y-axis and AirBags on the x-axis. (Hint: `boxplot` is a valid ggplot2 geometry) ```{r} qplot(data = Cars93, x = AirBags, y = Price, geom = "boxplot") ``` **Do you observe any association between AirBag type and Price? Explain.** > The more airbags a vehicle has, the greater the price tends to be. #### 3. Plotting a map At the end of lecture we used the following code to generate a headmap of murder rates in the US. ```{r, fig.width = 7, fig.height = 4, fig.align='center'} library(maps) # Create data frame for map data (US states) states <- map_data("state") # Here's what the states data frame looks like str(states) # Make a copy of the data frame to manipulate arrests <- USArrests # Convert everything to lower case names(arrests) <- tolower(names(arrests)) arrests$region <- tolower(rownames(USArrests)) # Merge the map data with the arrests data based on region choro <- merge(states, arrests, sort = FALSE, by = "region") choro <- choro[order(choro$order), ] # Plot a map, filling in the states based on murder rate qplot(long, lat, data = choro, group = group, fill = murder, geom = "polygon") + scale_fill_gradient(low = "#56B1F7", high = "#132B43") ``` Modify the code above to produce a heatmap of `assault` rates instead, with **orange colours** instead of blue colours for the gradient. Here's a document that may help you pick colors: [Hex colour picker](http://www.w3schools.com/tags/ref_colorpicker.asp) ```{r} # Make a copy of the data frame to manipulate arrests <- USArrests # Convert everything to lower case names(arrests) <- tolower(names(arrests)) arrests$region <- tolower(rownames(USArrests)) # Merge the map data with the arrests data based on region choro <- merge(states, arrests, sort = FALSE, by = "region") choro <- choro[order(choro$order), ] # Plot a map, filling in the states based on murder rate qplot(long, lat, data = choro, group = group, fill = assault, geom = "polygon") + scale_fill_gradient(low = "#FFAD33", high = "#1A0F00") ```