Fall 2020

What are we trying to accomplish?

The sample analysis was shown only in class and is not viewable in this version of the notes.


  • Course overview

  • Introduction to R, RStudio and R Markdown

  • Programming basics

How this class will work

  • No programming knowledge presumed

  • Some stats knowledge presumed. E.g.:
    • Hypothesis testing (t-tests, confidence intervals)
    • Linear regression
  • Synchronous attendance is encouraged, but not required

  • Class will be very cumulative


  • Two 80 minute lectures a week:
    • First 60-70 minutes: concepts, methods, examples
    • Last 10-20 minutes: short labs
  • Class participation (10%)
  • Quizzes (10%)
  • Weekly homework (40%)
  • Final project (2.5 weeks) (40%)
    • Disclaimer: To pass the class, you must achieve a passing score on the final project (at least 21/40)


  • Class participation (10%)
    • Labs: Each lecture has an accompanying lab assignment.
    • Course website shows how participation grade will be calculated
  • Quizzes (10%)
    • 4 quizzes in the second half of term. Dates TBA.
  • Homework assignments (40%)
    • There will be 5 weekly HW assignments
    • Single lowest HW score will be dropped
    • HW is assigned on Wednesdays and due the following Wednesday at 1:30 PM ET
    • Late homework will not be accepted for credit
  • Final project (40%)
    • You will write a report analysing a policy question using a publicly available data set

Course resources

  • Assignments, office hours, class notes, grading policies, useful references on R: http://www.andrew.cmu.edu/~achoulde/94842/

  • Canvas for gradebook and for turning in homework

  • Piazza for discussion forum (embedded in Canvas)
    • Please post class- and homework-related questions on Piazza instead of emailing the teaching staff
  • Check the class website for everything else

  • No required textbook, but I highly recommend:

Goal of this class

This class will teach you to use R to:

  • Generate graphical and tabular data summaries

  • Efficiently manipulate data using tidyverse libraries

  • Perform statistical analyses (e.g., hypothesis testing, regression modeling)

  • Produce reproducible statistical reports using R Markdown

  • Near the end of the course, we'll also preview how to integrate R with other tools (e.g., databases and the web)
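
To give a flavor of the goals listed above, here is a minimal sketch using the built-in `mtcars` data set. The choice of data set and the variable names `mpg_test` and `mpg_fit` are assumptions made for illustration, not taken from the lecture. The course uses tidyverse libraries for data manipulation; this sketch sticks to base R so that it runs without installing any packages.

```r
# Tabular summary: count cars by number of cylinders
table(mtcars$cyl)

# Numeric summary of fuel efficiency (miles per gallon)
summary(mtcars$mpg)

# Hypothesis test: does mean mpg differ between
# automatic (am = 0) and manual (am = 1) transmissions?
mpg_test <- t.test(mpg ~ am, data = mtcars)
mpg_test$p.value

# Linear regression of mpg on car weight
mpg_fit <- lm(mpg ~ wt, data = mtcars)
coef(mpg_fit)
```

Each line can be run on its own in the Console; the results print directly below the command.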

Why R?

  • Free (open-source)

  • Programming language (not point-and-click)

  • Excellent graphics

  • Offers one of the broadest ranges of statistical tools

  • Easy to generate reproducible reports

  • Easy to integrate with other tools

The R Console

Basic interaction with R is through typing in the console

This is the terminal or command-line interface
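
As a minimal sketch of what typing in the console looks like, the commands below can be entered one at a time at the `>` prompt. The variable name `x` and its values are hypothetical, chosen only for illustration.

```r
# Use R as a calculator: type an expression and press Enter
2 + 3

# Store values in a variable with the assignment operator <-
x <- c(4, 8, 15)   # a numeric vector (made-up example data)

# Call a function on the variable
mean(x)
```

R prints the result of each expression right after you press Enter. An assignment such as `x <- ...` prints nothing, but the variable is then available for later commands.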

RStudio is an IDE for R

RStudio has 4 main windows (‘panes’):

  • Source
  • Console
  • Workspace/History
  • Files/Plots/Packages/Help

RStudio: Panes overview

  1. Source pane: create a file that you can save and run later

  2. Console pane: type or paste in commands to get output from R

  3. Workspace/History pane: see a list of variables or previous commands

  4. Files/Plots/Packages/Help pane: see plots, help pages, and other items in this window.

Console pane

  • Use the Console pane to type or paste commands to get output from R

  • To look up the help file for a function or data set, type ? followed by its name into the Console
    • E.g., try typing in ?mean
  • Use the tab key to auto-complete function and object names
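
The help lookups described above might look like this in practice. This is a small sketch; `mean` is the example from the notes, and the other calls are standard base R helpers added here for illustration.

```r
# Open the help page for the mean() function
?mean

# Equivalent long form of the same request
help("mean")

# Run the examples listed at the bottom of the help page
example(mean)

# List objects whose names contain "mean",
# useful when you only remember part of a name
apropos("mean")
```

In RStudio, the first two commands open the documentation in the Help tab, while `example()` and `apropos()` print their results in the Console.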

Source pane

  • Use the Source pane to create and edit R and Rmd files

  • The menu bar of this pane contains handy shortcuts for sending code to the Console for evaluation

Files/Plots/Packages/Help pane

  • By default, any figures you produce in R will be displayed in the Plots tab
    • Menu bar allows you to Zoom, Export, and Navigate back to older plots
  • When you request a help file (e.g., ?mean), the documentation will appear in the Help tab
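
To see the Plots and Help tabs in action, you could try something like the following sketch. It uses the built-in `cars` data set, which is an assumption made for illustration and not necessarily the example shown in class.

```r
# Scatterplot of stopping distance against speed
plot(cars$speed, cars$dist,
     xlab = "Speed (mph)",
     ylab = "Stopping distance (ft)",
     main = "Stopping distance vs. speed")

# A second figure; in RStudio, the arrows in the Plots tab
# let you navigate back to the scatterplot
hist(cars$dist,
     xlab = "Stopping distance (ft)",
     main = "Histogram of stopping distance")

# Help for a data set works the same way as for a function
?cars
```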

RStudio: Source and Console panes

RStudio: Console