FES 720 Introduction to R

LAB: Looking at Data

This is the lab associated with the lesson on Looking at Data in R.

Please upload your final completed lab on the Assignments page in Canvas, as per the instructions below.

You are welcome, and expected, to ask the instructors for help if you get stuck. Please also come to the R Bootcamp on Friday, where there are coffee and snacks!


We will practice looking at data using two different data sets. We will use the read.table() function to read them in from the course website (you will need to be connected to the information super-highway to do so).

(We will explore reading in data more carefully in a future lesson.)

First, read in the Harry Potter data from the course website, using the following code. This will download and make the data available to you in your current R session.

# Read in the Harry Potter movies data (Table 1)
dat.hp <- read.table("http://www.simonqueenborough.info/R/data/harry-potter-movies.txt", sep = '\t', header = TRUE)
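
As a rough illustration of what the sep and header arguments in the call above do, here is a minimal sketch that reads a tiny tab-separated sample from a string instead of a URL. The sample text and the object names (sample_text, toy, toy_nohdr) are made up for this example and are not part of the course data.

```r
# Hypothetical two-row, tab-separated sample (not the course data).
sample_text <- "name\tscore\nAda\t90\nGrace\t85"

# header = TRUE: the first line supplies the column names.
toy <- read.table(text = sample_text, sep = "\t", header = TRUE)
toy

# header = FALSE: the first line is treated as data, and R assigns
# default column names (V1, V2).
toy_nohdr <- read.table(text = sample_text, sep = "\t", header = FALSE)
toy_nohdr
```

Comparing the two printed results shows why header = TRUE is needed for files whose first line holds the column names.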

Table 1. Box office history for all Harry Potter movies.

Release Date | Movie | Production Budget | Domestic Opening Weekend | Domestic Box Office | Worldwide Box Office
Nov 16, 2001 | Harry Potter and the Sorcerer’s Stone | $125,000,000 | $90,294,621 | $317,575,550 | $974,755,371
Nov 15, 2002 | Harry Potter and the Chamber of Secrets | $100,000,000 | $88,357,488 | $261,987,880 | $878,979,634
Jun 4, 2004 | Harry Potter and the Prisoner of Azkaban | $130,000,000 | $93,687,367 | $249,538,952 | $796,688,549
Nov 18, 2005 | Harry Potter and the Goblet of Fire | $150,000,000 | $102,685,961 | $290,013,036 | $896,911,078
Jul 11, 2007 | Harry Potter and the Order of the Phoenix | $150,000,000 | $77,108,414 | $292,004,738 | $942,943,935
Jul 15, 2009 | Harry Potter and the Half-Blood Prince | $250,000,000 | $77,835,727 | $301,959,197 | $935,083,686
Nov 19, 2010 | Harry Potter and the Deathly Hallows: Part I | $125,000,000 | $125,017,372 | $295,983,305 | $960,283,305
Jul 15, 2011 | Harry Potter and the Deathly Hallows: Part II | $125,000,000 | $169,189,427 | $381,011,219 | $1,341,511,219
Nov 18, 2016 | Fantastic Beasts and Where to Find Them | $180,000,000 | $74,403,387 | $234,037,575 | $803,798,342

Then, read in the results from the 2015 New Haven Road Race 5k.

# Read in NHV road race results 2015.
dat.nhv <- read.table("http://www.simonqueenborough.info/R/data/race-data-full.txt", sep = '\t', header = TRUE)
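
If your internet connection is unreliable, one option is to save a local copy of a file the first time you download it and read from that copy afterwards. The sketch below is only a suggestion: read_cached() is a hypothetical helper written for this example, not a base R function or part of the course code, and the local file name in the example call is an assumption.

```r
# Hypothetical helper: download a tab-separated file once, then reuse
# the local copy on later calls.
read_cached <- function(url, path) {
  if (!file.exists(path)) {
    # Only contacts the website when no local copy exists yet.
    download.file(url, destfile = path, quiet = TRUE)
  }
  read.table(path, sep = "\t", header = TRUE)
}

# Example call (needs an internet connection the first time only):
# dat.nhv <- read_cached("http://www.simonqueenborough.info/R/data/race-data-full.txt",
#                        "race-data-full.txt")
```

The local copy is written to your current working directory unless you give a full path.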

Questions

A. Harry Potter data

  1. Display the column names of the Harry Potter data. How are they different from Table 1?

  2. How many columns are there in this dataset?

  3. How many rows?

  4. Display the structure of the data.

  5. Specifically, what class are the dollar columns?

  6. What class do they need to be for calculations?

B. New Haven road race data

  1. How many columns and rows does this dataset have? Use three different functions, each of which returns one or both of these numbers.

  2. What class is each column?

  3. Display the first 10 rows of the data.

  4. Display the last 3 rows of the data.

  5. How many different towns were represented in the race?

  6. How many males and females participated?


How to write up your answers

Please check the help page for a reminder, if you need to.