Practice Problem 2
This practice problem gives you experience exploring data and running a linear regression on your own using statistical software. You are welcome to use any statistical software you wish, and you may work in groups of up to 3 for this practice problem. If you work in a group, please have every member of the group complete the ICON survey.
Instructions
What to turn in
Please turn in a document that includes any relevant statistics and figures you created. You will be asked to complete a graded survey on ICON as part of this practice problem.
Finally, upload the final document to ICON and complete the graded survey.
Due Date
Due around March 4th, 2024. There is no penalty for late submissions as long as they are submitted by May 9th.
Data
The data for this activity come from Kaggle. They contain 104 rows and 14 columns describing possums collected in Australia.
The data can be obtained in CSV format and are also found within the “data” folder inside the IDAS. A short description of each attribute is as follows.
variable | class | description |
---|---|---|
case | integer | Observation number. |
site | integer | Site. |
Pop | character | Population, either Vic (Victoria) or other (New South Wales or Queensland). |
sex | character | Sex of possum, either m (male) or f (female). |
age | integer | Age. |
hdlngth | integer | Head length, in mm. |
skullw | integer | Skull width, in mm. |
totlngth | integer | Total length, in cm. |
taill | integer | Tail length, in cm. |
footlgth | integer | Foot length, in mm. |
earconch | integer | Ear conch length, in mm. |
eye | integer | Distance from medial canthus to lateral canthus of right eye, in mm. |
chest | integer | Chest girth, in cm. |
belly | double | Belly girth, in cm. |
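As a starting point for exploring the data, the two attributes used in this problem could be read and summarized along the following lines. This is only a minimal sketch using the Python standard library; the file name `possum.csv`, the treatment of `NA` as a missing value, and the helper names `read_numeric_columns` and `summarize` are assumptions for illustration, not part of the assignment. Any statistical software can produce the same summaries.

```python
import csv
import statistics
from pathlib import Path

# Assumed file name; change this to wherever your copy of the data lives.
CSV_PATH = Path("possum.csv")


def read_numeric_columns(handle, columns):
    """Return {column: [float, ...]}, skipping rows with a missing value."""
    data = {name: [] for name in columns}
    for row in csv.DictReader(handle):
        raw = [row[name].strip() for name in columns]
        if any(value in ("", "NA") for value in raw):
            continue  # drop incomplete rows so the columns stay aligned
        for name, value in zip(columns, raw):
            data[name].append(float(value))
    return data


def summarize(values):
    """Basic descriptive statistics for one numeric column."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


# Only runs when the (assumed) file is present in the working directory.
if CSV_PATH.exists():
    with CSV_PATH.open(newline="") as handle:
        possums = read_numeric_columns(handle, ["taill", "totlngth"])
    for name, values in possums.items():
        print(name, summarize(values))
```

A scatterplot of `totlngth` against `taill` is a natural next step before fitting any model.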
Guiding Question
Does the tail length (`taill` attribute) explain variation in the total length (`totlngth` attribute) of the possum?
Questions
- Fit a linear regression to answer the research question highlighted above. Interpret the intercept and slope of the linear regression. That is, what do these terms mean?
- What are the r-squared and sigma estimates from the linear regression? Interpret these two values in the context of the problem. That is, what do these two terms mean in the context of the data and this problem?
- Finally, in a couple of sentences, provide a summary of the overall model. Does the model appear to be useful for predicting or explaining variation in the total length of the possum from its tail length? Use statistics from the analysis steps above to support your answer.
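The quantities named in these questions can be computed by hand, which may help when interpreting them. The sketch below fits a simple least-squares line and returns the intercept, slope, r-squared, and sigma, where sigma is taken to be the residual standard error, sqrt(SSE / (n - 2)). The function name `fit_simple_regression` is hypothetical, and your statistical software may report these values under different labels.

```python
import math


def fit_simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x for two equal-length lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx                   # expected change in y per one-unit change in x
    intercept = y_bar - slope * x_bar   # predicted y when x equals 0

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(r ** 2 for r in residuals)       # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)   # total variation in y

    return {
        "intercept": intercept,
        "slope": slope,
        "r_squared": 1 - sse / sst,          # share of variation in y explained by x
        "sigma": math.sqrt(sse / (n - 2)),   # typical size of a residual, in units of y
    }
```

For this problem, `x` would hold the `taill` values and `y` the `totlngth` values, so the slope and sigma are both expressed in cm.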