---
title: Lecture "13" - Interactive graphics and intro to shiny
author: Prof. Chouldechova
date: 94842
output: 
  html_document:
    toc: true
    toc_depth: 5
---

```{r}
library(tidyverse)
library(plotly) # for interactive graphics
library(DT)     # for interactive tables

options(scipen = 4)
```

We'll illustrate some examples using a bunch of different data sets.

- `flights`: This data contains information on all flights departing from one of the 3 NYC airports (EWR, LGA, JFK) in 2013.
- `diamonds`: You've seen this one before
- `txhousing`: This data contains information on the Texas housing market from 2000 - 2015
- `gapminder`: You've seen this one before

```{r}
# You'll need to run install.packages("nycflights13") and
# install.packages("gapminder")
flights <- nycflights13::flights
```

```{r}
# Load the data from the gapminder library
data(gapminder, package = "gapminder")
```

## Interactive tables: datatable

Sometimes it's helpful to output interactive summary or data tables into our reports. We can do this with the `datatable` function from the `DT` library.

```{r}
# Printing data
flights %>%
  group_by(carrier, origin) %>%
  summarize(`Average delay (mins)` = round(mean(dep_delay, na.rm = TRUE), 0))
```

```{r}
# datatable
flights %>%
  group_by(carrier, origin) %>%
  summarize(`Average delay (mins)` = round(mean(dep_delay, na.rm = TRUE), 0)) %>%
  # table options go in the `options` argument as a named list
  datatable(options = list(pageLength = 12))
```

## Interactive graphics with (gg)plotly

One of the simplest ways to get started with interactive graphics in R is to use the `ggplotly` function in the `plotly` library. It converts ggplot objects into their interactive counterparts.

Let's create some plots with ggplot and see what happens when we make them interactive.

#### Bar charts

```{r}
# Form a bar chart showing the number of flights from each airport
p <- ggplot(flights, aes(x = origin)) +
  geom_bar()
p
```

```{r}
ggplotly(p)
```

#### Box plots

Here's a boxplot example which shows the distribution of departure delays across airports.

```{r}
p <- ggplot(flights, aes(x = origin, y = dep_delay)) +
  geom_boxplot() +
  # log2 scale: flights with zero or negative delay are dropped with a warning
  scale_y_continuous(trans = 'log2')
p
```

```{r}
ggplotly(p)
```

Note that `plotly` is its own graphing library. It just happens to be particularly convenient to use `ggplotly`, because it lets us make interactive versions of graphics that we already have experience constructing.

Here's an example of a ggplotly version vs a plotly version of the boxplot. I'm switching to the gapminder data because htmlwidgets are super resource intensive for large data.

```{r}
p <- ggplot(gapminder, aes(continent, lifeExp, color = continent)) +
  geom_boxplot()
ggplotly(p)
```

```{r}
plot_ly(gapminder, x = ~continent, y = ~lifeExp, color = ~continent, type = "box")
```

Here's how we would do log-scaling for a plotly plot. First, a plot without log scaling on the y-axis.

```{r}
plot_ly(gapminder, x = ~continent, y = ~gdpPercap, color = ~continent, type = "box")
```

Now a plot with logarithmic y-axis scaling, as controlled through the `layout` command:

```{r}
plot_ly(gapminder, x = ~continent, y = ~gdpPercap, color = ~continent, type = "box") %>%
  layout(yaxis = list(type = "log"))
```

#### Dot plots

Now let's look at an example where we calculate the average departure delay for flights out of LGA for each destination airport, and produce a plot that contains that information. In this plot the dot size represents the number of flights from LGA to that destination.

```{r}
p <- flights %>%
  filter(origin == "LGA") %>%
  group_by(dest) %>%
  summarize(av_dep_delay = mean(dep_delay, na.rm = TRUE),
            count = n()) %>%
  filter(count > 50) %>%
  mutate(dest = reorder(dest, av_dep_delay)) %>%
  ggplot(aes(x = dest, y = av_dep_delay, size = count)) +
  geom_point(alpha = 0.5) +
  scale_size_area() +
  ylab("Average departure delay") +
  xlab("Destination airport") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
p
```

```{r}
ggplotly(p)
```

#### Scatterplots

Now here's a scatterplot example with the diamonds data.
We'll start by subsampling the data so we don't have so many points. The `sample_n` command makes it easy to sample a subset of the rows of the data.

```{r}
diamonds.sub <- diamonds %>% sample_n(2000)
```

```{r}
p <- ggplot(diamonds.sub, aes(x = carat, y = price, color = color)) +
  geom_point()
p
```

```{r}
ggplotly(p)
```

The default behavior for `ggplotly` is to provide the values of all aesthetic mappings in the hover text. It is also possible to customize what gets displayed. The most general way of doing this is to specify a `text` aesthetic that contains the information you want to see. In the example below we specify `text` to be the carat, clarity, color and cut of the diamond. The `paste` command pastes together values into a single string, with values separated by the `sep` argument. Setting `sep = "\n"` causes each element to be displayed on a new line.

```{r}
p <- ggplot(diamonds.sub, aes(x = carat, y = price, color = color,
                              text = paste(carat, clarity, color, cut, sep = "\n"))) +
  geom_point()
p
```

```{r}
# Show only the custom text in the hover box
ggplotly(p, tooltip = "text")
```

```{r}
# Add a smoother on top of the points; ggplotly converts both layers
p <- ggplot(diamonds.sub, aes(x = carat, y = price, color = color)) +
  geom_point(alpha = 0.5) +
  geom_smooth()
p
```

```{r}
ggplotly(p)
```

#### Line charts

Here we'll have a look at how home sales have varied over time. We'll focus first on sales in Austin, TX.

```{r}
p <- txhousing %>%
  filter(city == "Austin") %>%
  ggplot(aes(x = month, y = sales, group = year)) +
  geom_line()
ggplotly(p)
```

ggplot and plotly make it really easy to create animations across time (or across any other variable of interest). To do this, you simply need to specify a `frame` variable.

```{r}
p <- txhousing %>%
  filter(city == "Austin") %>%
  ggplot(aes(x = month, y = sales, frame = year)) +
  geom_line()
ggplotly(p)
```

You can animate certain layers while keeping others static. It all depends on where you specify the `frame` variable.
Here's an example where we have all of the years in the background, with the current year highlighted in blue.

```{r}
p <- txhousing %>%
  filter(city == "Austin") %>%
  ggplot(aes(x = month, y = sales)) +
  geom_line(aes(group = year), alpha = 0.2) +
  geom_line(aes(frame = year), color = "steelblue", size = 2)
ggplotly(p)
```

Let's have a look at several cities at the same time. Note that we're using the `animation_opts()` function here to change properties of the plotly animation. `frame` controls the amount of time between frames (in milliseconds).

```{r}
p <- txhousing %>%
  filter(city %in% c("Austin", "Dallas", "Houston", "San Antonio")) %>%
  ggplot(aes(x = month, y = sales)) +
  geom_line(aes(group = year), alpha = 0.2) +
  geom_line(aes(frame = year), color = "steelblue", size = 1) +
  facet_grid(. ~ city)
ggplotly(p) %>% animation_opts(frame = 1000)
```

Through the animation options you can also change how the frames transition from one to the next by setting the `easing` parameter. There are many options. See [here](https://github.com/plotly/plotly.js/blob/master/src/plots/animation_attributes.js).

```{r, fig.width = 10}
ggplotly(p) %>% animation_opts(frame = 1000, easing = "elastic")
```

#### Animating the gapminder data

First we'll look at how life expectancy changes over time across countries. We'll start the animation in 1952, with the countries ordered by their life expectancy in that first year.

```{r, fig.width = 15}
p <- gapminder %>%
  # order countries by their first recorded (1952) life expectancy
  mutate(country = reorder(country, lifeExp, function(.x) .x[1])) %>%
  ggplot(aes(x = country, y = lifeExp, color = continent, size = pop)) +
  geom_point(aes(frame = year)) +
  theme(axis.text.x = element_text(angle = 60, vjust = 1, hjust = 1))
ggplotly(p) %>% animation_opts(1000)
```

Here's an animated plot that shows life expectancy and GDP evolving over time. The `redraw = FALSE` option means that the base plot won't be redrawn at every transition.

```{r}
p <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, color = continent, size = pop)) +
  geom_point(alpha = 0.1) +
  geom_point(aes(frame = year, ids = country)) +
  scale_x_continuous(trans = "log10")

ggplotly(p) %>% animation_opts(1000, redraw = FALSE)
```

## Want to learn more?

There's a ton more that one can do with interactive graphics (and tables!) in R. Some of the examples used in today's lecture were borrowed from [Carson Sievert's awesome slides](https://plotcon17.cpsievert.me/workshop/day2/#1). I encourage you to have a further look through those slides to see some of the other things you can do with ggplotly. Things like joint "brushing" and "filtering" are particularly useful if you're designing interactive dashboards.

You should also have a look at `htmlwidgets`, which you can learn about [here](https://www.htmlwidgets.org/).
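
To give a rough feel for the joint "brushing" mentioned above, here is a minimal sketch using plotly's `highlight_key()` and `highlight()` functions. It is not taken from the lecture or from the slides. The choice of the `diamonds` data, the names `diamonds_brush`, `shared`, `p1`, `p2` and `linked`, the seed, and the sample size of 500 are all assumptions made for illustration. It also uses `plot_ly` directly rather than `ggplotly`.

```{r}
library(plotly)

# Assumption: a small random subsample keeps the widget responsive
set.seed(1)
diamonds_brush <- ggplot2::diamonds[sample(nrow(ggplot2::diamonds), 500), ]

# Wrap the data so that both panels share one selection state
shared <- highlight_key(diamonds_brush)

p1 <- plot_ly(shared, x = ~carat, y = ~price, type = "scatter", mode = "markers")
p2 <- plot_ly(shared, x = ~depth, y = ~price, type = "scatter", mode = "markers")

# Selecting points in either panel highlights the matching rows in the other
linked <- subplot(p1, p2, shareY = TRUE) %>%
  highlight(on = "plotly_selected", off = "plotly_deselect")
linked
```

With the box or lasso select tool active, dragging over points in one panel should highlight the same diamonds in the other panel.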