Homework 1 (Data Analysis Using R) Solution

Instructions:

  1. Copy your answer to each question AND the R code by which you reached your answer to a document. The answers should be correctly ordered.

  2. In addition to the main document, you also need to submit a .txt file; see the instructions for Problem 1, part 2.

Problem 1 (3 pts): Consider the following data frame about the final four teams in the FIFA World Cup 2014:

Country      Continent      W  D  L  Scored  Conceded  Star
Germany      Europe         5  1  0  17      4         Ozil
Argentina    South America  6  0  0  8       3         Messi
Netherlands  Europe         5  0  1  12      4         Robben
Brazil       South America  5  0  1  11      11        Neymar

  1. (2 pts) Use the function data.frame to reproduce this table as a data frame in R. After you have created this new object, assign it to final4 for later use.

  2. (1 pt) Use the function write.table to export the object final4 to a .txt file (use whatever name you like). You also need to submit this file to the dropbox in Elearning.
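
One possible way to approach both parts of Problem 1 is sketched below. The column names follow the table above; the file name final4.txt and the tab separator are arbitrary choices made for illustration, not requirements of the assignment.

```r
# Build the table column by column; each vector becomes one variable.
final4 <- data.frame(
  Country   = c("Germany", "Argentina", "Netherlands", "Brazil"),
  Continent = c("Europe", "South America", "Europe", "South America"),
  W         = c(5, 6, 5, 5),
  D         = c(1, 0, 0, 0),
  L         = c(0, 0, 1, 1),
  Scored    = c(17, 8, 12, 11),
  Conceded  = c(4, 3, 4, 11),
  Star      = c("Ozil", "Messi", "Robben", "Neymar"),
  stringsAsFactors = FALSE
)

# Export to a tab-separated text file in the working directory.
# "final4.txt" is an arbitrary name; row names are dropped because
# they carry no information here.
write.table(final4, file = "final4.txt", sep = "\t",
            row.names = FALSE, quote = FALSE)
```

A tab separator keeps "South America" in a single field, so the file can be read back with read.table("final4.txt", header = TRUE, sep = "\t").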

Problem 2 (4 pts): Consider the data set cars, which is available in R. Type cars directly to see what it looks like. It is a data frame consisting of two variables, speed and dist.

  1. (2 pts) Plot a scatter plot to view the relationship between speed and dist. Put speed on the horizontal axis and dist on the vertical axis. Also add a red colored straight line with an intercept of 0 and a slope of 3.

  2. (1 pt) Create a new data frame named cars.time consisting of three variables: speed, dist, and time. Use the equation below to calculate the new variable time.

time = distance / speed

  3. (1 pt) Find the subset of observations with time > 3, then assign it to danger.cars.
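
The three parts of Problem 2 could be sketched as follows. This is a minimal version using base R graphics only; the axis labels are an illustrative choice, and time is computed as dist divided by speed, as in the equation above.

```r
# Part 1: scatter plot of dist against speed, plus a red reference line
# with intercept 0 and slope 3.
plot(cars$speed, cars$dist, xlab = "speed", ylab = "dist")
abline(a = 0, b = 3, col = "red")

# Part 2: new data frame with the derived variable time = dist / speed.
cars.time <- data.frame(
  speed = cars$speed,
  dist  = cars$dist,
  time  = cars$dist / cars$speed
)

# Part 3: keep only the observations whose time exceeds 3.
danger.cars <- subset(cars.time, time > 3)
```

Logical indexing, cars.time[cars.time$time > 3, ], gives the same rows as subset here because time has no missing values.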

Problem 3 (3 pts): The Excel workbook "USPopulation.xlsx" (source: United States Census Bureau) contains population estimates for the states of the United States from 2010 to 2012, one sheet for each year. Your goal is to read all three years of data into R, then create a single R data frame.

  1. Install the R package readxl from CRAN, and load this package to your current R session.

  2. (2 pts) First, use the excel_sheets function to check the names of all the sheets. Next, use the read_excel function to import the data stored in each sheet separately into R and assign the results to US2010, US2011, and US2012 respectively. To accomplish this you have to call read_excel three times, each time with a different sheet name. Check the help pages of read_excel and excel_sheets to learn how to use them.

(Hint: If you changed your working directory to where "USPopulation.xlsx" is located, the path argument in both read_excel and excel_sheets is simply "USPopulation.xlsx".)

  3. (1 pt) Create a single data frame that has the following four variables: Geography, Population estimates of states in 2010, Population estimates of states in 2011, and Population estimates of states in 2012. The dimension of the data frame should be 52 by 4.
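
A rough sketch of the workflow for Problem 3 follows. It rests on several assumptions that the assignment does not confirm: that the sheets appear in the order 2010, 2011, 2012; that the first two columns of each sheet are the geography name and the population estimate; and that the helper combine_years and the column names Pop2010, Pop2011, Pop2012 are acceptable (they are hypothetical names introduced here). The toy data frames contain placeholder values only, so the merge step can be tried without the workbook.

```r
# Hypothetical helper: keep the first two columns of each year's data
# (assumed to be geography name and estimate), rename them, and merge
# the three years on Geography.
combine_years <- function(d2010, d2011, d2012) {
  prep <- function(d, year) {
    d <- as.data.frame(d)[, 1:2]
    names(d) <- c("Geography", paste0("Pop", year))
    d
  }
  merged <- merge(prep(d2010, 2010), prep(d2011, 2011), by = "Geography")
  merge(merged, prep(d2012, 2012), by = "Geography")
}

# Toy check with placeholder values (not real population figures).
toy <- combine_years(
  data.frame(Geography = c("A", "B"), Estimate = c(1, 2)),
  data.frame(Geography = c("B", "A"), Estimate = c(20, 10)),
  data.frame(Geography = c("A", "B"), Estimate = c(100, 200))
)

# Real data: runs only if readxl is installed (install.packages("readxl"))
# and the workbook is in the working directory.
path <- "USPopulation.xlsx"
if (requireNamespace("readxl", quietly = TRUE) && file.exists(path)) {
  library(readxl)
  sheets <- excel_sheets(path)   # check the sheet names first
  US2010 <- read_excel(path, sheet = sheets[1])  # assumed order: 2010,
  US2011 <- read_excel(path, sheet = sheets[2])  # 2011,
  US2012 <- read_excel(path, sheet = sheets[3])  # 2012
  US <- combine_years(US2010, US2011, US2012)
  print(dim(US))                 # the assignment expects 52 by 4
}
```

Merging on Geography rather than binding columns side by side avoids mismatched rows if the sheets list the states in different orders.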
