Practice Problem 5

This practice problem is intended to give you practice exploring data and running a linear regression on your own using statistical software. You are welcome to use any statistical software you wish, and you may work in groups of up to 3. If you work in a group, please have every member of the group complete the ICON survey.

Instructions

What to turn in

Please turn in a document that includes any relevant statistics and figures you created. Upload the final document to ICON, then complete the graded ICON survey that is part of this practice problem.

Due Date

Due around April 15th, 2024. There is no penalty for late submissions as long as they are submitted by May 9th, 2024.

Data

The data for this activity are San Francisco rental data that originally come from Tidy Tuesday. I have done some processing to drop some missing data and to remove some attributes from the larger data set for our use.

Note: Use the data linked here or posted to the IDAS. The data can be obtained in CSV format. A short description of each attribute follows.

Attribute Name    Description
post_id           unique ID
date              date
year              year
nhood             neighborhood
city              city
county            county
price             price in USD
beds              number of beds
sqft              square feet of rental
room_in_apt       room in apartment
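
As a rough illustration of how a CSV with these attributes might be read, here is a minimal sketch using only the Python standard library. The three rows in SAMPLE_CSV are made up for illustration and are not taken from the real data, the date format shown is a guess, and the file name "rent.csv" mentioned in the comment is hypothetical. Any statistical software can do the same job.

```python
import csv
import io

# Made-up rows for illustration only; they are NOT from the real data set.
# The date format is an assumption. With a real file, replace
# io.StringIO(SAMPLE_CSV) with open("rent.csv", newline=""), where
# "rent.csv" is a hypothetical file name.
SAMPLE_CSV = """post_id,date,year,nhood,city,county,price,beds,sqft,room_in_apt
a1,20180101,2018,mission,san francisco,san francisco,2500,1,650,0
a2,20180102,2018,sunset,san francisco,san francisco,3400,2,900,0
a3,20180103,2018,soma,san francisco,san francisco,1900,0,400,0
"""


def load_rentals(handle):
    """Read rental rows and convert the numeric columns used in the questions."""
    rows = []
    for record in csv.DictReader(handle):
        record["price"] = float(record["price"])
        # Go through float first in case beds is stored as "1.0" (an assumption).
        record["beds"] = int(float(record["beds"]))
        record["sqft"] = float(record["sqft"])
        rows.append(record)
    return rows


rentals = load_rentals(io.StringIO(SAMPLE_CSV))
print(len(rentals), "rows; columns:", sorted(rentals[0]))
```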

Guiding Question

Does the number of bedrooms explain variation in the price of San Francisco rentals?

Questions

  1. Descriptively explore differences in rental prices across the different numbers of bedrooms.

  2. Fit a linear regression to answer the guiding question above. Use this model equation: price ~ beds.

  3. Fit a second linear regression to answer the guiding question above, this time treating the number of bedrooms as a categorical predictor. Use this model equation: price ~ factor(beds).
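
To show how the three questions relate to one another, the sketch below works through them on a tiny made-up data set using only the Python standard library. The (beds, price) pairs are invented for illustration and say nothing about the real rental data, and the variable names are my own. It assumes treatment (dummy) coding with the smallest number of bedrooms as the reference level, which is what R's factor() does by default. Your statistical software will report these quantities for you, along with standard errors and R-squared.

```python
from collections import defaultdict
from statistics import mean

# Made-up (beds, price) pairs for illustration only; NOT the real rental data.
data = [(0, 1800), (0, 2000), (1, 2400), (1, 2600),
        (2, 3300), (2, 3500), (3, 4000), (3, 4400)]

# Question 1: descriptive summary, the mean price for each number of bedrooms.
prices_by_beds = defaultdict(list)
for beds, price in data:
    prices_by_beds[beds].append(price)
group_means = {b: mean(p) for b, p in sorted(prices_by_beds.items())}

# Question 2: price ~ beds. Beds enters as one numeric predictor, so the model
# estimates a single slope: the same price change for every extra bedroom.
xs = [beds for beds, _ in data]
ys = [price for _, price in data]
x_bar, y_bar = mean(xs), mean(ys)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

# Question 3: price ~ factor(beds). With dummy coding and the smallest level as
# the reference, least squares gives an intercept equal to the reference group
# mean and one coefficient per other level equal to that group's mean minus
# the reference mean.
reference = min(group_means)
factor_intercept = group_means[reference]
factor_coefs = {b: m - factor_intercept
                for b, m in group_means.items() if b != reference}

print("group means:", group_means)
print("price ~ beds: intercept", intercept, "slope", slope)
print("price ~ factor(beds): intercept", factor_intercept, "coefs", factor_coefs)
```

Comparing the two fits is the point of questions 2 and 3: the numeric model forces equal price steps between bedroom counts, while the factor model lets each bedroom count have its own mean price.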
