This course offers an introduction to advanced topics in statistics, with a focus on understanding data in the behavioral and social sciences. We will cover a range of methods such as regression, mixed effects models, and generalized linear models. You will learn how these methods work, as well as how to implement them using the statistical computing environment R. In addition to these more traditional methods for analyzing data, we will also discuss simulation methods (e.g. Monte Carlo, bootstrapping) and Bayesian statistics.
 Tobi Gerstenberg  Andrew Lampinen  ShaoFang (Pam) Wang  Mona Rosenke 

Role:  Instructor  Teaching assistant  Teaching assistant  Teaching assistant 
Email:  gerstenberg@stanford.edu  lampinen@stanford.edu  shaofang@stanford.edu  rosenke@stanford.edu 
Office:  302  316  409  424 
Office hours:  Monday 2-3pm  Friday 12:30-1:30pm  Wednesday 12pm  Tuesday 1:00-2:00pm 
Section:  Friday 1:30-2:20pm in 160-314 
Wednesday 2:30-3:20pm in 160-326 
Lectures: The class meets Monday, Wednesday, and Friday 10:30-11:50am in 200-203 in the History Corner.
Here is what you need to get ready for class.
Week  Day  Date  Topic  Content  Reading  Resources  Datacamp 

1  M  7 Jan  Introduction  • Course introduction  • Cheatsheet RStudio • Cheatsheet R Markdown 1 • Cheatsheet R Markdown 2 • R Markdown for class reports 
• Introduction to R • RStudio IDE 1 • RStudio IDE 2 • R Markdown 

W  9 Jan  Visualization I  • Best practices • Introduction to RStudio • Introduction to library(ggplot2) • Reporting results using R Markdown 
• Data visualization (#1) • Data visualization (#3) 
• Cheatsheet ggplot2  • ggplot part 1 • ggplot part 2 • Reporting 

F  11 Jan  Visualization II  • Making nice plots 
• Data visualization (#4) • Data visualization (#8) • R for Data Science (#27) 
• Cheatsheet shiny  • ggplot part 3 • Shiny 1 • Shiny 2 

2  M  14 Jan  Data wrangling I  • Introduction to library(dplyr) • Data manipulation • select(), filter(), arrange(), mutate() 
• R for Data Science (#9-15)  • Cheatsheet base R • Cheatsheet data transformation 
• dplyr • tidyverse • cleaning data • cleaning data: case studies 
Tue  15 Jan  Homework 1 due at 8pm  • Visualization  
W  16 Jan  Data wrangling II  • group_by(), summarize() • gather(), spread() • Joining tables, left_join() • Read and save data 
• R for Data Science chapters (#17-21) • Data visualization (#5) 
• Cheatsheet strings • Cheatsheet data import 
• joining tables • writing functions • importing data 1 • importing data 2 

F  18 Jan  Probability  • Introduction to probability theory • Conditional probability • Bayes’ rule 
• probability puzzles in R  
3  M  21 Jan  no class (Martin Luther King, Jr. Day)  
Tue  22 Jan  Homework 2 due at 8pm  • Data wrangling & visualization  
W  23 Jan  Simulation I  • Probability distributions • Generating data 
• Foundations of Probability in R  
F  25 Jan  Simulation II  • Central limit theorem • Sampling distributions • p-values • Confidence intervals 
• Foundations of Inference  
4  M  28 Jan  Modeling data  • Hypothesis testing as model comparison • Errors and parameter estimates • Statistical inferences about parameters 
• Data analysis: A model comparison approach to regression, ANOVA, and beyond (#1-4)  • statistical modeling 1  
Tue  29 Jan  Homework 3 due at 8pm  • Probability and simulation  
W  30 Jan  Linear model I  • Correlation • Simple regression 
• statistical modeling 2 • correlation 

F  1 Feb  Linear model II  • Multiple regression • Interpreting interactions 
• Data visualization (#6) 

5  M  4 Feb  Linear model III  • Analysis of Variance • Follow-up tests 
• modeling  
Tue  5 Feb  Homework 4 due at 8pm  
W  6 Feb  Linear model IV  • Planned contrasts  • inference in regression  
F  8 Feb  Power analysis  • Making statistical decisions • Calculating effect sizes • Calculating power 
• Cheatsheet apply functions  • functional programming  
6  M  11 Feb  Bootstrapping  • Computing confidence intervals • Visualizing uncertainty 

W  13 Feb  no class  Midterm due at 12pm (noon) 

F  15 Feb  Linear model V  • Model assumptions • Model evaluation • Cross-validation • BIC, AIC 
• multiple regression  
7  M  18 Feb  no class (Presidents’ Day)  
W  20 Feb  Linear mixed effects model I  • mixed effects model  
Thu  21 Feb  Project proposal due at 8pm  
F  22 Feb  Linear mixed effects model II  
8  M  25 Feb  Linear mixed effects model III  
Tue  26 Feb  Homework 5 due at 8pm  • Modeling data  
W  27 Feb  no class  
F  1 Mar  Generalized linear model  • Logistic regression • Generalized mixed effects model 
• multiple regression • generalized linear model • categorical data 

9  M  4 Mar  Bayesian Data Analysis I  • Prior, likelihood, posterior  • Bayesian inference  
Tue  5 Mar  Homework 6 due at 8pm  • Linear mixed effects models  
W  6 Mar  Bayesian Data Analysis II  • Testing hypotheses  
F  8 Mar  Bayesian Data Analysis III  • Comparing models  
10  M  11 Mar  Course summary and outlook  
W  13 Mar  Guest lecture: Prof. Justin Gardner  
Thu  14 Mar  Homework 7 due at 8pm  • Bayesian data analysis  
F  15 Mar  Guest lecture: ShaoFang Wang and Mona Rosenke  
Thu  21 Mar  Final project presentations (8:30am-11:30am)  Written final project report due at 10pm 
You will learn how to use R to …
Understand the philosophy behind null hypothesis significance testing (NHST) and Bayesian statistics through …
Formulate research questions as statistical models and …
Communicate what you have learned about your data …
Contribute to open and reproducible science through …
In “A Vision for Stanford”, university president Marc Tessier-Lavigne states that Stanford wants to be
“an inspired, inclusive and collaborative community of diverse scholars, students and staff, where all are supported and empowered to thrive.”
Let’s try our best together in this class to make this happen!
I will …
You will …
For many classes, there will be readings and/or accompanying online interactive tutorials. We won’t adopt a course textbook.
Here is a list of useful resources:
Course notes:
The course notes are available as an online book here.
Free online books:
ggplot2, dplyr, sampling methods, …
Textbooks:
Here are some sources for finding interesting data sets:
Please familiarize yourself with Stanford’s honor code. We will adhere to it and follow through on its penalty guidelines.
When is the weekly homework due?
Each week, we will make the homework available on Wednesday after class. It is then due the following Tuesday at 8pm.
Can we work in groups?
Work for the course will include both homework assignments and a final project.
What if I can’t make a section?
We offer two sections per week. If you can’t make the section that you’ve been assigned to, then please go to the other section. If you can’t make either section, make sure to get the section materials and go through them on your own.
Students who may need an academic accommodation based on the impact of a disability must initiate the request with the Office of Accessible Education (OAE). Professional staff will evaluate the request with required documentation, recommend reasonable accommodations, and prepare an Accommodation Letter for faculty dated in the current quarter in which the request is being made. Students should contact the OAE as soon as possible since timely notice is needed to coordinate accommodations. The OAE is located at 563 Salvatierra Walk (phone: 723-1066, URL: http://oae.stanford.edu).
Stanford is committed to ensuring that all courses are financially accessible to its students. If you require assistance with the cost of course textbooks, supplies, materials and/or fees, you should contact the Diversity & First-Gen Office (DGen) at opportunityfund@stanford.edu to learn about the FLIbrary and other resources they have available for support.
Stanford offers several tutoring and coaching services:
We welcome feedback regarding the course at any point. Please feel free to email us directly, or leave anonymous feedback for the teaching team by using our online form.