ETC3250 Introduction to machine learning 2024 Semester One

2024 Semester One (June 2024)

Examination Period

Faculty of Business and Economics

EXAM CODES: ETC3250

TITLE OF PAPER: Introduction to machine learning

EXAM DURATION: 2 hours 10 mins

Section A:

Information

Section A. Please answer ALL questions.

Question 1

Which of the following categorical response variables matches the binary matrix coding below:

Select one:

a. (A, A, B, C, C, A)'

b. (A, B, C, B, C, A)'

c. (B, A, C, A, A, C)'

d. None of these because the coding is not binary

e. (C, B, C, A, A, C)'

Question 2

Which of these plots would be considered the model plotted in the data space?

A: The line of points is an SVM boundary

B: Convex hulls marking the results of a k-means clustering

C: Votes matrix from a random forest fit

Select one:

a. B

b. C

c. A and C

d. A

e. A and B

f. A and B and C

g. B and C

Question 3

The term _________ means the model overlaid on the data, with the primary purpose being to examine how well the model fits the main structures present in the data.

Select one:

a. model-in-the-data-space

b. biplot

c. principal component analysis

d. data-in-the-model-space

e. tours of linear projections

Question 4

Which of the following projection matrices match the axes for this projection:

Select one:

      X1        X2  var
1  0.324   0.03325  tr1
2  0.033   0.84597  tr2
3 -0.079  -0.50492  hed
4  0.689  -0.00038  ad1
5  0.176   0.07742  ad2
6 -0.618   0.14931  ad3

none of them match

     X1      X2  var
1  0.47   0.049  tr1
2  0.18   0.783  tr2
3  0.11  -0.593  hed
4  0.68  -0.158  ad1
5 -0.13   0.080  ad2
6 -0.50  -0.046  ad3

        X1        X2  var
1  0.99673   0.00017  tr1
2  0.00017   0.99933  tr2
3 -0.00624  -0.03463  hed
4  0.05882  -0.00076  ad1
5  0.01496   0.00513  ad2
6 -0.05296   0.01092  ad3

      X1      X2  var
1  0.496   0.139  tr1
2  0.446   0.505  tr2
3  0.330  -0.606  hed
4  0.243  -0.349  ad1
5 -0.621   0.034  ad2
6 -0.024  -0.485  ad3

Question 5

When doing 5-fold cross-validation, with these splits of the data:

fold 1: 1, 3, 4

fold 2: 2, 10, 15

fold 3: 7, 8, 9

fold 4: 5, 11, 14

fold 5: 6, 12, 13

Which subset of observations would be used to train the model when working with fold 4?

Select one:

a. 1, 3, 4

b. 1, 2, 3, 4, 6, 7, 8, 9, 10, 12, 13, 15

c. 5, 11, 14

d. 7, 8, 9

e. 2, 10, 15
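
The mechanics behind this question can be sketched in a few lines of Python. The sketch assumes the usual convention that "working with fold 4" means fold 4 is held out for testing and every other fold is used for training. The function name `training_indices` is illustrative only and does not come from the exam.

```python
# Fold assignments copied from the question; keys are fold numbers.
folds = {
    1: [1, 3, 4],
    2: [2, 10, 15],
    3: [7, 8, 9],
    4: [5, 11, 14],
    5: [6, 12, 13],
}


def training_indices(folds, held_out):
    """Return the sorted observations from every fold except `held_out`."""
    return sorted(i for k, obs in folds.items() if k != held_out for i in obs)


# Assumption: fold 4 is the held-out (test) fold.
train_fold4 = training_indices(folds, held_out=4)
print(train_fold4)
```

The training set and the held-out fold never share an observation, which is what makes the held-out error an honest estimate.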

Question 6

From the following summary of a PCA, what proportion of the total variance would four principal components explain? (Note: The data was standardised prior to computing the PCA. If no values match exactly, pick the closest.)

> auswt20_pca$sdev

[1] 2.723 2.053 1.175 0.974 0.902 0.836 0.700 0.533

[9] 0.466 0.421 0.351 0.321 0.273 0.220 0.081 0.063

[17] 0.015

Select one:

a. 41%

b. 82%

c. 46%

d. 0.457

e. 0.82

f. 5.7%

g. 0.057

h. 0.407
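
As a rough check on the arithmetic, the proportion of variance can be computed from the printed standard deviations. This is a sketch, not an official solution. It relies on the standard fact that each component's variance is the square of its `sdev` entry, and the helper name `proportion_explained` is hypothetical.

```python
# Standard deviations copied from the auswt20_pca$sdev output in the question.
sdev = [2.723, 2.053, 1.175, 0.974, 0.902, 0.836, 0.700, 0.533,
        0.466, 0.421, 0.351, 0.321, 0.273, 0.220, 0.081, 0.063, 0.015]


def proportion_explained(sdev, k):
    """Share of total variance carried by the first k principal components."""
    variances = [s ** 2 for s in sdev]  # variance = sdev squared
    return sum(variances[:k]) / sum(variances)


prop4 = proportion_explained(sdev, 4)
print(round(prop4, 2))
```

Because the data was standardised, the squared values sum to roughly the number of variables (17), which is a useful sanity check on the printed output.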

Question 7

For data having n = 92 and p = 5, how many parameters would need to be estimated to compute the variance-covariance matrix?

Select one:

a. 14

b. 4

c. 92

d. 91

e. 24

f. 25

g. 15
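
The count follows from the symmetry of a variance-covariance matrix: a p x p symmetric matrix has p variances on the diagonal and p(p-1)/2 distinct covariances off it. A minimal sketch follows; the function name is illustrative and not taken from the exam.

```python
def covariance_parameter_count(p):
    """Distinct entries of a symmetric p x p matrix: p + p(p-1)/2 = p(p+1)/2."""
    return p * (p + 1) // 2


# p = 5 as in the question; the sample size n does not enter the count.
n_params = covariance_parameter_count(5)
print(n_params)
```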

Question 8

The following output summarises the results from PCA on player statistics from women’s AFL matches in 2023. Statistics for each player have been averaged across the season. There are statistics on 508 players. PCA was computed on the correlation matrix.

a. (1pt) How many variables are in the data?

b. (1pt) How many PCs have eigenvalues higher than would be expected from purely uncorrelated data?

c. (3pts) Which variables significantly contribute to PC1? Explain your answer.

d. (1pt) Is it reasonable to assume that the variables were standardised when computing the PCA? Why?

e. (2pts) Interpret PC1, in a few sentences.

Question 9

The following output summarises the first four PCs from PCA on player statistics from women’s AFL matches in 2023. Statistics for each player have been averaged across the season. There are statistics on 508 players. PCA was computed on the correlation matrix.

Make a sketch that shows where the axis for behinds would be on a biplot of PC1 vs PC2.

(3pts: 1.5 for correct line segment, 1.5 for labelling axes, and adding scales.)

Question 10

Explain in a few sentences what type of player Randall is.

Question 11

The following plots are produced from player statistics from women’s AFL matches in 2023. Statistics for each player have been averaged across the season. There are statistics on 508 players. Both results were computed on standardised data.

Explain in a few sentences what would be learned from the UMAP representation of the AFLW statistics that might differ from that shown in the first two PCs.

Section B:

Information

Section B. Please answer ALL questions.

Question 12

From the following plot of data, what would likely be the pooled variance-covariance matrix?

VC1            | VC2            | VC3          | VC4
     x1   x2   |      x1   x2   |     x1  x2   |       x1    x2
x1  5.6 -3.0   | x1 1.03 0.98   | x1 5.4 2.9   | x1  1.14 -0.98
x2 -3.0  5.6   | x2 0.98 1.14   | x2 2.9 4.9   | x2 -0.98  1.03

Select one:

a. VC1

b. VC4

c. VC3

d. None of these

e. VC2

Question 13

The following plot shows the data and three possible boundaries from an LDA fit, marked A, B and C.

If the model is fitted with group 1 having a higher prior probability, which is likely the boundary for that model?

Select one:

a. None of these is possible

b. C

c. A

d. B

e. Either A or C is possible

Question 14

In the derivation of different forms of the equations for a logistic regression model:

What is the explanation of going from step 3 to 4?

Select one:

a. take natural log of both sides

b. subtract 1 from both sides

c. multiply numerator and denominator of LHS by y

d. divide numerator and denominator by $e^{\beta_0 + \beta_1 x}$

e. invert both sides

Question 15

For two classes coded as 0 and 1, what would be the class prediction for the following logistic model fit?

Select one:

a. 0.0183

b. 1

c. 0

d. 0.881

e. 0.0180

Question 16

For the following random forest model fit, and votes matrix values for five observations, what would be the class prediction for observation 3?

> pebbles_rf

Call:

randomForest(formula = cl ~ ., data = pebbles)

Type of random forest: classification

Number of trees: 500

No. of variables tried at each split: 1

OOB estimate of error rate: 0.51%

Confusion matrix:

A B class.error

A 101 1 0.0098

B 0 94 0.0000

> pebbles_rf$votes[ids,]

A B

1 0.688 0.31

2 0.778 0.22

3 0.455 0.55

4 0.048 0.95

5 0.133 0.87

Select one:

a. A and B are equally plausible

b. 0.778

c. A

d. B

e. 0.55
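
How a votes matrix turns into class predictions can be sketched as follows. The sketch assumes the default majority-vote rule, where the predicted class is the one with the largest vote proportion. The values are copied from the `pebbles_rf$votes` output, and the helper name is hypothetical.

```python
# Vote proportions copied from the question, keyed by observation number.
votes = {
    1: {"A": 0.688, "B": 0.31},
    2: {"A": 0.778, "B": 0.22},
    3: {"A": 0.455, "B": 0.55},
    4: {"A": 0.048, "B": 0.95},
    5: {"A": 0.133, "B": 0.87},
}


def predict_from_votes(row):
    """Return the class label with the largest share of tree votes."""
    return max(row, key=row.get)


predictions = {obs: predict_from_votes(row) for obs, row in votes.items()}
print(predictions)
```

The prediction is always a class label. The vote proportion itself measures how decisive the forest was for that observation.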

Question 17

The following values are the predictive probabilities of the test set for a random forest fitted to a data set with two classes. There are 196 observations, cl indicates true class. The values are sorted, and you can assume that rows 1-97 are all identical, and rows 108-196 are identical. If class B is the positive class, compute sensitivity for a cutoff of 0.7 (anything above 0.7 is predicted to be B).

id cl A B

1 A 1.00 0.00

...

97 A 1.00 0.00

98 A 0.80 0.20

99 A 0.78 0.22

100 A 0.75 0.25

101 A 0.69 0.31

102 A 0.45 0.55

103 B 0.22 0.78

104 B 0.13 0.87

105 B 0.08 0.92

106 B 0.06 0.94

107 B 0.05 0.95

108 B 0.00 1.00

...

196 B 0.00 1.00

Select one:

a. 0.98

b. 1

c. 0

d. 0.01

e. 0.02
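
The sensitivity calculation can be reproduced with a short sketch. The table is rebuilt from the rows shown, using the stated assumption that rows 1-97 and rows 108-196 are identical to the first and last rows printed. Sensitivity is taken as true positives over all actual positives, with B as the positive class. The function name is illustrative.

```python
# (true class, predicted probability of B) for all 196 rows.
rows = (
    [("A", 0.00)] * 97                                                    # rows 1-97
    + [("A", 0.20), ("A", 0.22), ("A", 0.25), ("A", 0.31), ("A", 0.55)]   # rows 98-102
    + [("B", 0.78), ("B", 0.87), ("B", 0.92), ("B", 0.94), ("B", 0.95)]   # rows 103-107
    + [("B", 1.00)] * 89                                                  # rows 108-196
)


def sensitivity(rows, cutoff, positive="B"):
    """True positive rate: actual positives whose probability exceeds the cutoff."""
    tp = sum(1 for cl, p in rows if cl == positive and p > cutoff)
    fn = sum(1 for cl, p in rows if cl == positive and p <= cutoff)
    return tp / (tp + fn)


sens = sensitivity(rows, cutoff=0.7)
print(sens)
```

Only the rows whose true class is B enter the denominator. The class A rows affect specificity, not sensitivity.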

Question 18

The following is a diagram for a neural network model. If the number of observations in the data set were n = 46, how many observations are there per parameter to be estimated in the model? (That is, n divided by the number of parameters.)

Select one:

a. 4.2

b. 23

c. 4

d. 2

e. 8.4

f. None of these

Question 19

From the following summaries:

answer the following questions:

a. (1pt) What is the data dimension, p?

b. (1pt) What is the pooled variance-covariance matrix?

c. (3pts) Compute and report the LDA rule to classify group A from group B, assuming equal prior probabilities.

Question 20

This summarises a tree fit to the last 25 years of Australian tourism data modeling the difference in patterns between Cairns and Melbourne. Only business travel is examined, and the four variables used are Q1, Q2, Q3, Q4 which are quarters in the year. Each series was standardised on itself, so values represent proportion of travel for business relative to other types of travel to the city in each quarter of each year. We are curious to determine whether business travel tends to be in different seasons in the two locations.

n= 50

node), split, n, loss, yval, (yprob)
      * denotes terminal node

 1) root 50 25 Cairns (0.50 0.50)
   2) Q1< 0.19 25 2 Cairns (0.92 0.08)
     4) Q2< 0.25 20 0 Cairns (1.00 0.00) *
     5) Q2>=0.25 5 2 Cairns (0.60 0.40)
      10) Q2>=0.27 2 0 Cairns (1.00 0.00) *
      11) Q2< 0.27 3 1 Melbourne (0.33 0.67) *
   3) Q1>=0.19 25 2 Melbourne (0.08 0.92)
     6) Q3< 0.23 3 1 Cairns (0.67 0.33) *
     7) Q3>=0.23 22 0 Melbourne (0.00 1.00) *

a. (1pt) How many terminal nodes are in the tree?

b. (1pt) How many of the four variables are used in the model?

c. (1pt) Which variable would be considered to be the most important?

d. (1pt) Which terminal nodes are pure nodes (having only one class)?

e. (1pt) How many observations are there at node 7?

f. (2pts) Based on this model, how would you describe the differences in business travel between Melbourne and Cairns?
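
The printed tree can be read mechanically, as the sketch below illustrates. The node table is transcribed by hand from the output above, with a trailing `*` recorded as `terminal=True`. The `Node` type and its field names are hypothetical and are not part of any tree-fitting package.

```python
from collections import namedtuple

# One record per printed line: node id, split rule, n, loss, yval, terminal flag.
Node = namedtuple("Node", "node split n loss yval terminal")

nodes = [
    Node(1, "root", 50, 25, "Cairns", False),
    Node(2, "Q1< 0.19", 25, 2, "Cairns", False),
    Node(4, "Q2< 0.25", 20, 0, "Cairns", True),
    Node(5, "Q2>=0.25", 5, 2, "Cairns", False),
    Node(10, "Q2>=0.27", 2, 0, "Cairns", True),
    Node(11, "Q2< 0.27", 3, 1, "Melbourne", True),
    Node(3, "Q1>=0.19", 25, 2, "Melbourne", False),
    Node(6, "Q3< 0.23", 3, 1, "Cairns", True),
    Node(7, "Q3>=0.23", 22, 0, "Melbourne", True),
]

# Terminal nodes are the lines marked with "*".
terminal = sorted(nd.node for nd in nodes if nd.terminal)
# A terminal node is pure when its loss (misclassified count) is zero.
pure = sorted(nd.node for nd in nodes if nd.terminal and nd.loss == 0)
# Variables used are the ones named in any split rule.
variables = sorted({nd.split[:2] for nd in nodes if nd.split != "root"})

print(terminal, pure, variables)
```

The `n` column of the terminal nodes sums to the 50 observations at the root, which is a quick way to confirm the transcription.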

Question 21

This summarises a linear support vector machine fit to the last 25 years of Australian tourism data modeling the difference in patterns between Cairns and Melbourne. Only business travel is examined, and the four variables used are Q1, Q2, Q3, Q4 which are quarters in the year. Each series was standardised on itself, so values represent proportion of travel for business relative to other types of travel to the city in each quarter of each year. We are curious to determine whether business travel tends to be in different seasons in the two locations.

> melb_cairns_svm_b$fit@b

[1] 4.2

> melb_cairns_svm_b$fit@SVindex

[1] 2 3 5 6 16 18 19 21 22 26 27 30 34 40 41 48 49 50

> melb_cairns_svm_b$fit@coef

[[1]]

[1] -10.0 -10.0 -10.0 -10.0 -9.0 -10.0 -10.0 -10.0 -9.4 10.0

[11] 10.0 10.0 10.0 8.5 10.0 10.0 10.0 10.0

The top few rows of the data are:

> melb_cairns[,c(1,3,5,7,9)] |> slice_head(n=5)

# A tibble: 5 × 5

Region Q1 Q2 Q3 Q4

1 Cairns 0.259 0.220 0.629 0.300

2 Cairns 0.205 0.345 0.586 0.348

3 Cairns 0.272 0.475 0.500 0.275

4 Cairns 0.173 0.533 0.523 0.380

5 Cairns 0.360 0.498 0.565 0.374

The coefficients for the separating hyperplane are calculated to be:

> melb_cairns_betas_b

Q1 Q2 Q3 Q4

5.8 3.1 5.0 3.2

a. (1pt) How many support vectors are used to compute the coefficients for the separating hyperplane?

b. (1pt) Write down the equation of the separating hyperplane.

c. (1pt) Which variable(s) would be considered to be the most important to distinguish the difference between business trips to Cairns and Melbourne?

d. (2pts) Explain how you would use the quantities from the fitted model object to compute the coefficients.

e. (2pts) Was Melbourne or Cairns coded as -1? Why do you think so?
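
For a linear SVM, the hyperplane coefficients are commonly obtained as a weighted sum of the support vectors, with each support vector's row multiplied by its entry in `coef`. The sketch below shows only the mechanics and rests on several assumptions: that the entries of `@coef` are in the same order as `@SVindex`, that the rows shown are on the same scale as the data used in the fit, and that only the three support vectors visible in the printed rows (2, 3 and 5) are available. It therefore yields a partial sum, not the reported coefficients. The function name is hypothetical.

```python
# Support vectors visible in the five printed rows (SVindex 2, 3, 5),
# paired with the first three entries of the coef output.
coef_shown = {2: -10.0, 3: -10.0, 5: -10.0}

# Q1..Q4 values for those rows, copied from the printed tibble.
rows_shown = {
    2: [0.205, 0.345, 0.586, 0.348],
    3: [0.272, 0.475, 0.500, 0.275],
    5: [0.360, 0.498, 0.565, 0.374],
}


def weighted_sum(coef, rows):
    """Sum of coef[i] * rows[i] over the given support vectors, per variable."""
    p = len(next(iter(rows.values())))
    return [sum(coef[i] * rows[i][j] for i in coef) for j in range(p)]


# Partial contribution of 3 of the 18 support vectors to (Q1, Q2, Q3, Q4).
partial = weighted_sum(coef_shown, rows_shown)
print([round(v, 2) for v in partial])
```

Summing the same product over all 18 support vectors, on the scale used by the fit, would give the full coefficient vector.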