F&ES 720a Introduction to R

M & W 14:30 – 15:50, Kroon G01, Lecture and lab

F 10:00 – 13:00, Kroon 319, R Bootcamp (~office hours), attendance is optional but recommended

Lecturer: Simon Queenborough

TF: Andrew Muehleisen


This course will teach you how to use the computer programming language R.

Many natural and social scientists use R to explore, analyze, and present their data.

This course is designed to help you learn and understand a new language, as well as provide guidance on best practice.

Much like learning any new language, it will often be frustrating as you grapple with new words, meanings, and the nuts and bolts of specific grammar and syntax. However, the end result is beautiful. A whole new world will open before you, and you will be equipped with a powerful tool and principles to guide you through it.

The course assumes no prior knowledge of R, of programming, or of interacting with your computer via the command line interface.

By the end of the course you will be able to:

Programming, writing code, command line interface, and collection, storage, analysis and display of data are all transferable skills that will be useful whatever software you end up using or work you end up doing.

Please note, this is not a statistics course. Links to background material will be provided, but we will not teach you statistics.


Deliberate Practice, R Proficiency, and Grades

The literature on learning suggests that three elements are helpful to learning a new skill quickly: repetition, assessment, and rapid feedback.

R is the perfect environment to learn the R language. We will repeat tasks and practice new parts of the language every week. R evaluates each command as soon as it is entered, but R does not in itself provide helpful assessment or feedback. We will use a program called SWIRL to provide immediate assessment and helpful feedback in R as you work through the lessons.

The labs will repeat much of the material of the lesson, but with new data and without the feedback from SWIRL. Submitted labs will be assessed regularly throughout the week, and multiple submissions are encouraged.

Finally, analysing your own, or another’s, data will motivate you to try out the code and analyses outside of class, to explore these data and reinforce the class material. Sufficiently motivated students could collaborate on a peer-reviewed publication.

The lecturer and TF will be available for any and all questions during class time.

Friday R Bootcamps are designed to provide extra time with the instructors as students work through lessons, labs, or assignments. We encourage all students to attend, to complete labs and assignments, and to have them graded during the bootcamps.

Approaches and grades

There are four main (jokingly-titled) approaches to this course:

ExperimenteR

You attend at least one class during shopping period and decide to take the c(o)urse next year.

Occasional useR

You want to learn R but may have no immediate need. You may have no data of your own and no interest in working toward a collaborative publication. You will learn the basics of a new language and complete all the lessons, labs, and best-practice assignments, but not put much effort into the data project. This approach will probably result in a Pass (work of acceptable character).

Regular useR

You have data! You want to learn R to analyse it! You will demonstrate proficiency in the material, demonstrate understanding of best practice, and contribute to the data project in a meaningful way, either towards your own data or a group project. This will probably result in a High Pass (work of outstanding character, above average).

Addiction/Dependency

By the end of the semester, you have developed proficiency in the material covered in the course, as well as self-learning skills, and are happy to teach yourself further material. Your data project report will highlight these skills and best practice. This approach will probably result in an Honors (work of exceptional character, professional-level).


Learning Goals & Outcomes

Successful students will be able to:

  1. Foundational knowledge (information and ideas)
  2. Application (skills, thinking and project management)
  3. Integration (connecting ideas)
  4. Human dimension (learning about oneself and others)
  5. Values (developing new feelings, interests and values)
  6. Learning how to learn (becoming a better student and self-directed learner)

Teaching and Learning Style

This course will provide an overview and introduction to the statistical software R. Class time will primarily be used for working through examples and problems as a class or individually. The best way to improve and feel comfortable in R is to use it frequently and regularly.

We will move through topics in the sequence below. How long we spend on each section depends on how comfortable everyone is. Reading and problems will be assigned in class, on the course website, and via email each week. There will also be occasional guest lectures on specific topics.

After October recess we will choose some advanced topics that will be useful to students as well as work on one data project. Data projects can be done singly or in groups. You may bring your own data, or we will have some data sets available that will help partners of FES in their work. Contribution to a peer-reviewed publication is a possibility.


Assessment

The course will be assessed via:

There are no examinations.

Lessons

Lessons are short (20-30 minute) scripts that each student works through independently, during class and/or afterwards in their own time or during the R Bootcamp sessions.

These lessons will walk through R commands and ideas, providing direct real-time feedback as the student writes code in RStudio.

Students must complete all lessons.

Students will have five days to complete each lesson (i.e., lessons must be submitted by Friday of each week).

Labs

Each lesson will have an associated lab that reinforces the material and develops understanding and coding skills.

Students must complete all labs.

Labs will be graded frequently and may be retaken as often as the student wants.

The lab final grade will be recorded one week after the lab is released (i.e., the Friday after the associated lesson is due).

Best Practice of the Week

Each week, students will be assigned a task that reflects best practice.

These tasks will include revising a figure, cleaning some code, cleaning a data set, and writing up a statistical test result.

Data project

The goal is to use this assignment either to advance the analysis of your own data in a single-author project, or to analyse data from a collaborator as a group project.

Several data sets will be made available, including from the Wildlife Conservation Society and African People & Wildlife.

Students will document their analysis of an existing data set.

The data set could be

The question you ask should not already have been addressed with that specific data set. Please talk to us if you are unsure.

The projects will proceed in two stages.

1. Proposal

Before October recess (Oct 17 2017, 23:59), please submit a proposal (max. 1 page A4).

Include the following sections:

  1. Introduction: A short introductory paragraph describing the scientific rationale for the question.

  2. Question: The specific scientific question/s or hypothesis/es you will address (e.g., what is the effect of providing cookies on student attendance at class?).

  3. Data set: A brief description of the data (e.g., unit of observation, number of observations, summaries of covariates and response variables), and either a link to the dataset or the dataset itself (if not online and data-sharing is agreed with the data owner).

We will check that the questions and data are feasible and appropriate.

Group data projects will be presented during class, either by the Lecturer or in a talk by a collaborator.

2. Project

Students will submit three documents for each project (due in December):

  1. The raw data set or link to it (please see Simon or Andrew if there are issues of data sharing).

  2. A file of documented R code, walking through the whole analysis, from data entry to completed publication-quality figures and/or tables. The length of this document will depend on the analysis. The file should be plain text, either .txt or .R.

  3. A summary document written in the style of an academic paper, containing the question, and methods and results sections pertinent to that question. This summary should be no more than 1 page of A4. The document should be submitted as a PDF.
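
As an illustration of item 2, a documented R script might be laid out along the following lines. This is only a minimal sketch, not a required template. It assumes R's built-in PlantGrowth data set as a stand-in for your own data, and the file names raw_data.csv and figure1.pdf are hypothetical.

```r
# analysis.R: hypothetical skeleton of a documented analysis script.
# The data set (R's built-in PlantGrowth) is a placeholder for your own data.

# 1. Data entry -------------------------------------------------------
# A real project would read the raw file instead, for example:
# dat <- read.csv("raw_data.csv")   # hypothetical file name
dat <- PlantGrowth

# 2. Checks -----------------------------------------------------------
str(dat)                   # variable types and number of observations
stopifnot(!anyNA(dat))     # stop early if any values are missing

# 3. Summaries --------------------------------------------------------
# Mean plant weight in each treatment group
group_means <- aggregate(weight ~ group, data = dat, FUN = mean)
print(group_means)

# 4. Figure -----------------------------------------------------------
# Written to a temporary PDF here; a real script would use a named file
fig_file <- file.path(tempdir(), "figure1.pdf")
pdf(fig_file, width = 5, height = 4)
boxplot(weight ~ group, data = dat,
        xlab = "Treatment group", ylab = "Dried plant weight")
dev.off()
```

Each numbered comment block marks one stage of the analysis, so a reader can follow the script from raw data to finished figure without running it.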

Grades will be based on:


Summary of Schedule

  1. Why R? Introduction and overview of R; Basic Building Blocks
  2. Exploring and Visualising Data
  3. Statistical Analysis of Data
  4. Advanced Statistics & Programming in R
  5. Specialised Topics in R (by request, e.g.):

Weekly Plan

Sat/Sun: Reading before class

Mondays: Background lecture on topic and lessons

Wednesdays: Best Practice lecture and labs

Fridays: R Bootcamp


University Standards of Conduct

Academic Integrity

Students are expected to comply with the Yale Graduate School’s Programs and Policies, especially that on personal conduct.

In particular, students should note the following:

The Graduate School specifically prohibits the following forms of behavior by graduate students:

1. Cheating on examinations, problem sets, and any other form of test; also, falsification and/or fabrication of data.

2. Plagiarism, that is, the failure in a dissertation, essay, or other written exercise to acknowledge ideas, research, or language taken from others.

3. Multiple submission of the same work without obtaining explicit written permission from both instructors before the material is submitted. 

With regard to this class, all work submitted for assessment should be the individual student’s own work.

Respect

The Yale Community is diverse in race, background, age, religion, and many other ways. This is certainly the case at F&ES, where more than 30% of students are international, and our domestic students come from a wide range of backgrounds. The personal actions of each community member must maintain and foster an inclusive and supportive environment that is respectful of our diverse community. Principles of free speech remain paramount at Yale, but it is vital that the learning environment is welcoming to students of all backgrounds and is free of conscious or unintended bias or harassment. Respect for the rights and dignities of all members of our community, regardless of their differences, is paramount.

Please familiarize yourself with Yale’s Equal Opportunity Statement and Statement on Sexual Harassment.

Sexual Assault and Harassment

Eradicating sexual misconduct is of the very highest priority at Yale. Therefore, please familiarize yourself with Yale’s definitions, policies, procedures, and resources for preventing and responding to sexual misconduct:  

  1. The Yale Sexual Misconduct Policies and Related Definitions outline behaviors that need to be reported.  If you are unsure whether an incident does (or could be perceived to) fall within the University definition of sexual misconduct, you should consult with the F&ES Title IX coordinator to make a determination.

  2. The University’s Sexual Misconduct Response website summarizes options for reporting and responding to sexual misconduct, as well as links to more detailed information.

  3. The Preventing and Responding to Sexual Misconduct booklet includes the definitions and resources above, and offers additional guidance on effective prevention, intervention, and response.

  4. The Rights and Options handout.


Updated: 2017-08-29