
Dataset: Boston Housing

The dataset comprises town-level socio-economic data on housing in the 506 towns that make up Greater Boston, including data on pollution levels.

Variable definitions are included in the Excel workbook containing the dataset.

Objective

To understand the determinants of the median value of housing in the 506 towns that make up Greater Boston.

Tasks

(All tasks must be undertaken using Excel. You will then use the results generated by your data analysis to complete the Answer Sheet included in this Assessment Brief. Only the completed Answer Sheet including the required screenshots is to be submitted.)

1. Calculate descriptive statistics for CRIME, ROOMS, AGE, TAX, PTRATIO and VALUE.

2. Generate histograms for CRIME, ROOMS, AGE, TAX, PTRATIO and VALUE.

3. Undertake skewness/outlier analysis and normality tests for CRIME, ROOMS, AGE, TAX, PTRATIO and VALUE.

4. Generate the correlation matrix for CRIME, ROOMS, AGE, TAX, PTRATIO and VALUE.

5. Create a scatterplot for VALUE vs ROOMS; include a linear trendline with the equation of the line and R².

6. Estimate a general regression model of VALUE using all the variables in the dataset.

7. Develop a specific regression model of VALUE that eliminates irrelevant variables and maximises R²(adj).

8. Undertake residual analysis for the estimated specific regression including residual plots and auxiliary regression analysis.

9. Complete the answer sheet and submit it on Minerva (via Turnitin).

Assignments should be a maximum of 2000 words in length.

All coursework assignments that contribute to the assessment of a module are subject to a word limit, as specified in the assessment brief. The word limit is an extremely important aspect of good academic practice, and must be adhered to. Unless stated otherwise in the relevant module handbook (if one has been provided), the word count includes EVERYTHING (i.e. all text in the main body of the assignment including summaries, subtitles, contents pages, tables, supportive material whether in footnotes or in-text references) except the main title, reference list and/or bibliography and any appendices. It is not acceptable to present matters of substance, which should be included in the main body of the text, in the appendices (“appendix abuse”). It is not acceptable to attempt to hide words in graphs and diagrams; only text which is strictly necessary should be included in graphs and diagrams.

You are required to adhere to the word limit specified and state an accurate word count on the cover page of your assignment brief. Your declared word count must be accurate, and should not mislead. Making a fraudulent statement concerning the work submitted for assessment could be considered academic malpractice and investigated as such. If the amount of work submitted is higher than that specified by the word limit or that declared on your word count, this may be reflected in the mark awarded and noted through individual feedback given to you.

The deadline date for this assignment is 12:00:00 noon on Wednesday 14th May 2025.

Semester 2, 2024/25

Assessed Coursework: Answer Sheet

1. Descriptive Statistics (Worth 12%)

(i) Complete the following table:

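| | CRIME | ROOMS | AGE | TAX | PTRATIO | VALUE |
|---|---|---|---|---|---|---|
| Mean | | | | | | |
| Median | | | | | | |
| Minimum | | | | | | |
| Maximum | | | | | | |
| 1st Quartile | | | | | | |
| 3rd Quartile | | | | | | |
| St Dev | | | | | | |
| CoV | | | | | | |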

Note: CoV = coefficient of variation
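
The brief requires Excel for the submitted work, so the following is only an optional cross-check. It is a minimal Python sketch of the statistics in the table, assuming the sample standard deviation (as in Excel's STDEV.S) and inclusive quartiles (as in QUARTILE.INC), with CoV taken as standard deviation divided by mean. The function name `describe` and the sample values are hypothetical and are not taken from the workbook.

```python
import statistics


def describe(values):
    """Return the summary statistics requested in the table for one variable."""
    # Inclusive quartiles interpolate between data points, as QUARTILE.INC does.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    mean = statistics.mean(values)
    st_dev = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return {
        "mean": mean,
        "median": median,
        "minimum": min(values),
        "maximum": max(values),
        "q1": q1,
        "q3": q3,
        "st_dev": st_dev,
        "cov": st_dev / mean,  # coefficient of variation, unit-free
    }


# Hypothetical values for illustration only.
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Because CoV is unit-free, it allows dispersion to be compared across variables measured on different scales, which is what part (ii) asks about.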

(ii) Which of the variables (CRIME, ROOMS, AGE, TAX, PTRATIO and VALUE) are most/least dispersed? Explain your answer.

2. Histograms (Worth 12%)

Insert histograms:

(i) CRIME

(ii) ROOMS

(iii) AGE

(iv) TAX

(v) PTRATIO

(vi) VALUE

3. Distributional Properties (Worth 10%)

(i) Is the distribution skewed for any of the variables (CRIME, ROOMS, AGE, TAX, PTRATIO and VALUE)? Explain your answer.

(ii) Is there evidence of any outliers in any of the variables (CRIME, ROOMS, AGE, TAX, PTRATIO and VALUE)? Explain your answer.
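
As an optional cross-check on the Excel output, the sketch below computes a skewness coefficient and flags candidate outliers. It assumes the adjusted sample skewness formula used by Excel's SKEW function and a 1.5 × IQR fence for outliers. The brief does not name an outlier rule, so the fence, the function names, and the sample values are all assumptions made for illustration.

```python
import statistics


def sample_skewness(values):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s)^3)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < lower or x > upper]


# Hypothetical values with one large observation, for illustration only.
sample = [1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 20.0]
print("skewness:", sample_skewness(sample))
print("outliers:", iqr_outliers(sample))
```

A positive coefficient indicates a longer right tail, a negative one a longer left tail, and a value near zero suggests rough symmetry.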

4. Correlation Analysis (Worth 10%)

(i) Correlation matrix

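Complete the correlation matrix:

| | CRIME | ROOMS | AGE | TAX | PTRATIO | VALUE |
|---|---|---|---|---|---|---|
| CRIME | | | | | | |
| ROOMS | | | | | | |
| AGE | | | | | | |
| TAX | | | | | | |
| PTRATIO | | | | | | |
| VALUE | | | | | | |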

(ii) Comment on the key points of the correlation analysis.
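
For an optional check on the Excel correlation matrix, the sketch below computes Pearson correlation coefficients pairwise. It assumes the matrix is the usual Pearson one, which is what Excel's CORREL function returns. The column names `A`, `B`, `C` and their values are hypothetical placeholders, not workbook data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


def correlation_matrix(columns):
    """Build a nested dict {row: {col: r}} from a dict of named columns."""
    names = list(columns)
    return {r: {c: pearson_r(columns[r], columns[c]) for c in names} for r in names}


# Hypothetical columns for illustration only.
data = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(data)
for row, cells in matrix.items():
    print(row, {col: round(r, 3) for col, r in cells.items()})
```

The matrix is symmetric with ones on the diagonal, so only the lower (or upper) triangle carries distinct information.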

5. Scatterplot (Worth 6%)

(i) Insert the scatterplot for VALUE vs ROOMS including a linear trendline with the equation of the line and R².

(ii) Comment on the scatterplot.
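
The equation and R² that Excel shows on a linear trendline come from an ordinary least-squares fit of y on x. The sketch below reproduces that calculation as an optional cross-check, using hypothetical x and y values rather than ROOMS and VALUE from the workbook. The function name `fit_line` is an assumption for illustration.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept, r2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical values for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
slope, intercept, r2 = fit_line(x, y)
print(f"y = {slope:.2f}x + {intercept:.2f}, R^2 = {r2:.2f}")
```

With a single explanatory variable, R² equals the square of the Pearson correlation between the two variables, which links this scatterplot to the corresponding cell of the correlation matrix.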

6. General Regression Model (Worth 18%)

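Complete the following table for the general regression model of VALUE:

| Outcome: VALUE | Coefficient | Standard Error | T Stat | P-Value |
|---|---|---|---|---|
| Intercept | | | | |
| CRIME | | | | |
| ZONE | | | | |
| INDUSTRY | | | | |
| RIVER | | | | |
| NOX | | | | |
| ROOMS | | | | |
| AGE | | | | |
| DISTANCE | | | | |
| HIGHWAY | | | | |
| TAX | | | | |
| PTRATIO | | | | |
| DIVERSITY | | | | |
| LOWSTAT | | | | |

| Goodness of Fit | |
|---|---|
| R² | |
| R²(adj) | |
| F statistic (P-value) | |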

Comments on Key Points
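
Two entries in the regression output can be checked by hand: each t statistic is the coefficient divided by its standard error, and R²(adj) penalises R² for the number of explanatory variables. The sketch below assumes the standard formula R²(adj) = 1 − (1 − R²)(n − 1)/(n − k − 1), with n = 506 observations and k = 13 explanatory variables as listed in the table. The R² value, coefficient and standard error used are hypothetical placeholders, not results from the dataset.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k explanatory variables and n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def t_stat(coefficient, standard_error):
    """t statistic for testing that a single coefficient equals zero."""
    return coefficient / standard_error


# n and k follow the brief (506 towns, 13 explanatory variables);
# the remaining inputs are hypothetical, for illustration only.
n_obs, n_vars = 506, 13
example_adj = adjusted_r2(0.75, n_obs, n_vars)
example_t = t_stat(3.0, 1.5)
print(f"R2(adj) = {example_adj:.4f}, t = {example_t:.2f}")
```

Dropping an irrelevant variable lowers k, so R²(adj) can rise even though R² itself can only fall or stay the same; this is the criterion task 7 uses to move from the general to the specific model.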


