This is the lab associated with the lesson on Looking at Data in R.
Please upload your final completed lab on the Assignments page in Canvas, as per the instructions below.
You are welcome, and expected, to ask the instructors for help if you get stuck. Please also come to the R Bootcamp on Friday; there will be coffee and snacks!
We will practice looking at data using two different data sets. We will use the read.table() function to read them in from the course website (you will need to be connected to the information super-highway to do so). We will explore reading in data more carefully in a future lesson.
First, read in the Harry Potter data from the course website, using the following code. This will download and make the data available to you in your current R session.
# Read in the Harry Potter Movies Table 1
dat.hp <- read.table("http://www.simonqueenborough.info/R/data/harry-potter-movies.txt", sep = '\t', header = TRUE)
Table 1. Box office history for all Harry Potter movies.
| Release Date | Movie | Production Budget | Domestic Opening Weekend | Domestic Box Office | Worldwide Box Office |
|---|---|---|---|---|---|
| Nov 16, 2001 | Harry Potter and the Sorcerer’s Stone | $125,000,000 | $90,294,621 | $317,575,550 | $974,755,371 |
| Nov 15, 2002 | Harry Potter and the Chamber of Secrets | $100,000,000 | $88,357,488 | $261,987,880 | $878,979,634 |
| Jun 4, 2004 | Harry Potter and the Prisoner of Azkaban | $130,000,000 | $93,687,367 | $249,538,952 | $796,688,549 |
| Nov 18, 2005 | Harry Potter and the Goblet of Fire | $150,000,000 | $102,685,961 | $290,013,036 | $896,911,078 |
| Jul 11, 2007 | Harry Potter and the Order of the Phoenix | $150,000,000 | $77,108,414 | $292,004,738 | $942,943,935 |
| Jul 15, 2009 | Harry Potter and the Half-Blood Prince | $250,000,000 | $77,835,727 | $301,959,197 | $935,083,686 |
| Nov 19, 2010 | Harry Potter and the Deathly Hallows: Part I | $125,000,000 | $125,017,372 | $295,983,305 | $960,283,305 |
| Jul 15, 2011 | Harry Potter and the Deathly Hallows: Part II | $125,000,000 | $169,189,427 | $381,011,219 | $1,341,511,219 |
| Nov 18, 2016 | Fantastic Beasts and Where to Find Them | $180,000,000 | $74,403,387 | $234,037,575 | $803,798,342 |
Then, read in the results from the 2015 New Haven Road Race 5k.
# Read in NHV road race results 2015.
dat.nhv <- read.table("http://www.simonqueenborough.info/R/data/race-data-full.txt", sep = '\t', header = TRUE)
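If you would like to see what the sep and header arguments do before working with the real files, here is a minimal sketch that reads a small tab-separated string instead of a URL. The toy.txt text and the dat.toy object are made up for illustration and are not part of the course data.

```r
# Hypothetical tab-separated text standing in for a downloaded file.
# "\t" is a tab and "\n" is a line break.
toy.txt <- "Movie\tBudget\nFilm A\t$1,000\nFilm B\t$2,500"

# sep = '\t' splits each line into fields at the tabs;
# header = TRUE uses the first line as the column names.
dat.toy <- read.table(text = toy.txt, sep = '\t', header = TRUE)

# Print the resulting data frame.
print(dat.toy)
```

The text argument lets read.table() parse a character string directly, which is handy for small experiments without an internet connection.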
A. Harry Potter data
Display the column names of the Harry Potter data. How are they different from Table 1?
How many columns are there in this data?
How many rows?
Display the structure of the data.
Specifically, what class are the dollar columns?
What class do they need to be for calculations?
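As a hint for thinking about classes, the sketch below shows one possible way to turn currency text into numbers. It uses a made-up vector (budget.chr) rather than the lab data, and it assumes the values are formatted with a dollar sign and comma separators, as in Table 1.

```r
# Hypothetical currency strings, formatted like the dollar values in Table 1.
budget.chr <- c("$125,000,000", "$100,000,000")
class(budget.chr)

# Remove every "$" and "," character, then convert what is left to numbers.
budget.num <- as.numeric(gsub("[$,]", "", budget.chr))
class(budget.num)

# Arithmetic now works on the converted values.
mean(budget.num)
```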
B. New Haven road race data
How many columns and rows does this dataset have? Use three different functions, each of which returns one or both of these numbers.
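As a reminder of how dimension functions behave, here is a small sketch on a made-up data frame (toy, with hypothetical columns id and town). Try the same functions on the race data yourself.

```r
# Hypothetical data frame with 4 rows and 2 columns.
toy <- data.frame(id = 1:4, town = c("A", "B", "A", "C"))

dim(toy)   # number of rows, then number of columns
nrow(toy)  # number of rows only
ncol(toy)  # number of columns only
```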
What class is each column?
Display the first 10 rows of the data.
Display the last 3 rows of the data.
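By default, head() and tail() show six rows; the n argument changes that. The sketch below demonstrates this on a made-up data frame (toy, with hypothetical columns place and time), not on the race data.

```r
# Hypothetical data frame with 20 rows.
toy <- data.frame(place = 1:20, time = seq(15, 34))

# First 10 rows.
head(toy, n = 10)

# Last 3 rows.
tail(toy, n = 3)
```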
How many different towns were represented in the race?
How many males and females participated?
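Questions like these come down to counting distinct values and counting how often each value occurs. Here is a minimal sketch on a made-up data frame; the column names town and sex are assumptions for illustration, so check the actual column names in dat.nhv before adapting it.

```r
# Hypothetical data frame; column names are illustrative only.
toy <- data.frame(town = c("A", "B", "A", "C"),
                  sex  = c("F", "M", "F", "F"))

# Number of distinct values in a column.
length(unique(toy$town))

# Number of rows for each distinct value in a column.
table(toy$sex)
```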
If you need a reminder of how a function works, please check its help page (for example, ?head).