• Your solutions should be your own work and are to be handed in by yourself to the Statistical Science Departmental office by 1600hrs on MONDAY, 23rd FEBRUARY
Declaration: I am aware of the UCL Statistical Science Department’s regulations on plagiarism for assessed coursework. I have read the guidelines in the student handbook and understand what constitutes plagiarism. I hereby affirm that the work I am submitting for this in-course assessment is entirely my own.
(a) Download the file lungfunction.dat from the G3 Moodle page. Read the data into R using read.table and then name the columns fev, height, inhaler, age, exercise.
(b) Obtain summary statistics for each quantitative variable and make useful plots of the data, i.e., plots that are relevant to the objectives of the study. Such plots may include, but are not necessarily restricted to, pairwise scatter plots with different plotting symbols for those who have or haven’t used an inhaler recently. Put plots together in a single figure where appropriate and consider using log scales for the quantitative variables.
(c) Find a linear model that enables fev to be predicted from the other variables and that is not more complicated than necessary. You may wish to consider using log transformations of one or more of the explanatory variables. All your models should be fitted using the lm function, and a wide range of models should be considered to make your choice of model convincing, with the use of appropriate diagnostics to assess them. Ultimately you are required to recommend a single model that is suitable for interpretation and to justify your recommendation.
(d) Write a brief report on your analysis in three sections:
I. Describe briefly what you found in your exploratory analysis in part (b).
II. Describe briefly (without too many technical details) what models you considered in part (c) and why you chose the model you did.
III. State your final model clearly and describe it in words. Remember to include an estimate of the error standard deviation and say what this means. Give an estimate of the effect on the average FEV1 of being older (e.g., by 1 year of age). Give an appropriate assessment of the uncertainty in your estimate.
and
The log-likelihood of µ and σ given a set of observations w1, ..., wn is
The function I(C) is the indicator function, taking the value 1 if the condition C is true and 0 if the condition C is false.
(a) Download the data trnormal.dat from the G3 Moodle page. Read it into R using scan.
(b) Obtain summary statistics for the data and plot a histogram.
(c) Write a function called negll that takes two arguments, (i) params, a vector containing the values of the two parameters (µ, σ), and (ii) dat, a vector w of the data, and returns the negative log-likelihood, -l(µ, σ | w). (Hint: the R functions pnorm and dnorm may be useful in computing the negative log-likelihood.)
(d) Use your function negll to evaluate and print out the negative log-likelihood for the data in trnormal.dat for a few sensible values of µ and σ.
(e) Use the R function nlm to find and print out the maximum likelihood estimates of µ and σ for the data in trnormal.dat by minimising the negative log-likelihood.
(f) Obtain and print out approximate standard errors for these estimates.
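#1
data=read.table("lungfunction.dat")#read the data (file named in part (a); assumes no header row)
colnames(data)=c("fev", "height","inhaler", "age", "exercise")#name the columns
summary(data)
cor(data)#pairwise correlations between the variables
plot(data)
attach(data)#attach the data frame so that columns can be referred to by name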
boxplot(fev ~ inhaler,
col = "yellow",
main = "Boxplot of fev by inhaler",
xlab = "inhaler",
ylab = "fev",
xlim = c(0, 3), ylim = c(5, 9), yaxs = "i")
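Part (b) also asks for scatter plots with different plotting symbols by inhaler use, grouped in one figure and possibly on log scales. A minimal sketch of that idea follows. It runs on simulated stand-in data, because lungfunction.dat is not reproduced here; the data frame `demo`, its made-up values, and the 0/1 coding of inhaler are assumptions for illustration only.

```r
# Simulated stand-in for lungfunction.dat (values are made up for illustration)
set.seed(1)
n <- 60
demo <- data.frame(
  height  = runif(n, 150, 190),
  age     = round(runif(n, 20, 60)),
  inhaler = rbinom(n, 1, 0.4)        # ASSUMPTION: 0 = no recent use, 1 = recent use
)
demo$fev <- exp(-4.5 + 1.2 * log(demo$height) - 0.004 * demo$age -
                0.1 * demo$inhaler + rnorm(n, sd = 0.08))

# Plotting symbol and colour depend on inhaler use
sym  <- ifelse(demo$inhaler == 1, 17, 1)
colr <- ifelse(demo$inhaler == 1, "red", "blue")

op <- par(mfrow = c(1, 2))           # two panels in a single figure
plot(demo$height, demo$fev, log = "xy", pch = sym, col = colr,
     xlab = "height (log scale)", ylab = "fev (log scale)")
plot(demo$age, demo$fev, log = "y", pch = sym, col = colr,
     xlab = "age", ylab = "fev (log scale)")
legend("topright", legend = c("no inhaler", "inhaler"),
       pch = c(1, 17), col = c("blue", "red"))
par(op)                              # restore the previous layout
```

With the real data, `demo` would be replaced by the data frame read from lungfunction.dat, and the symbol mapping adjusted to however inhaler is actually coded.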
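boxplot(fev ~ height,
col = "red")
#full model with all explanatory variables (assumed; the definition of lm1 is missing from the original)
lm1=lm(fev~height+inhaler+age+exercise,data=data)
summary(lm1)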
#stepwise selection in both directions to drop irrelevant variables
lm2=step(lm1,direction="both")
summary(lm2)
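#log-transform an explanatory variable
#log(inhaler) is undefined or redundant for a two-level indicator, so log(height) is used here instead (an assumption)
lm3=lm(fev~log(height)+inhaler+age,data=data)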
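Parts (c) and (d) ask for diagnostics, an estimate of the error standard deviation, and the age effect with its uncertainty. One way this could be done is sketched below, again on simulated stand-in data. The data frame `demo`, the model names `full` and `reduced`, and all coefficient values used in the simulation are hypothetical, not results from lungfunction.dat.

```r
# Simulated stand-in data (made-up values, for illustration only)
set.seed(2)
n <- 80
demo <- data.frame(
  height   = runif(n, 150, 190),
  age      = round(runif(n, 20, 60)),
  inhaler  = rbinom(n, 1, 0.4),
  exercise = rpois(n, 3)
)
demo$fev <- -2 + 0.05 * demo$height - 0.03 * demo$age -
  0.4 * demo$inhaler + rnorm(n, sd = 0.4)

full    <- lm(fev ~ height + inhaler + age + exercise, data = demo)
reduced <- lm(fev ~ height + inhaler + age, data = demo)

print(anova(reduced, full))   # F-test: does exercise improve the fit?
print(AIC(full, reduced))     # smaller AIC is preferred

op <- par(mfrow = c(2, 2))
plot(reduced)                 # residuals vs fitted, normal Q-Q, scale-location, leverage
par(op)

sigma_hat <- summary(reduced)$sigma             # estimated error standard deviation
age_est   <- coef(reduced)["age"]               # change in mean fev per extra year of age
age_ci    <- confint(reduced, "age", level = 0.95)  # 95% confidence interval for that change
print(sigma_hat)
print(age_est)
print(age_ci)
```

The residual standard error reported by `summary` is the estimate of the error standard deviation, i.e. the typical size of the deviation of an individual's fev from the model's mean prediction. The confidence interval from `confint` is one appropriate way to express the uncertainty in the age effect.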
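#2
data=scan("trnormal.dat")#read the data for question 2 (file named in part (a))
#evaluate negll, the function from part (c), at a few parameter values and print the results
print(negll(c(1,1),data))
print(negll(c(2,2),data))
print(negll(c(1,3),data))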
#estimate the parameters by maximum likelihood, i.e. by minimising negll
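The definition of negll and the nlm call are not shown in this post, and neither is the density from the question. The following is therefore only a sketch under a loudly stated assumption: the data are taken to be normal(µ, σ) observations truncated to values above a hypothetical point `lower` (default 0). The `lower` argument, the simulated vector `w`, and the true values used to simulate it are all illustrative and not from trnormal.dat.

```r
# ASSUMPTION: w_i ~ Normal(mu, sigma) truncated to w > lower.
# The actual truncation in the assignment is not given in the post.
negll <- function(params, dat, lower = 0) {
  mu    <- params[1]
  sigma <- params[2]
  if (sigma <= 0 || any(dat < lower)) return(Inf)  # outside the parameter space or support
  # log density of each point, minus the log of the probability mass kept by truncation
  ll <- sum(dnorm(dat, mean = mu, sd = sigma, log = TRUE)) -
    length(dat) * pnorm(lower, mean = mu, sd = sigma,
                        lower.tail = FALSE, log.p = TRUE)
  -ll
}

# Simulated stand-in data: draw from Normal(1, 2) and keep values above 0
set.seed(3)
z <- rnorm(2000, mean = 1, sd = 2)
w <- z[z > 0]

start <- c(mean(w), sd(w))                    # crude starting values
fit   <- nlm(negll, p = start, dat = w, hessian = TRUE)

mle <- fit$estimate                           # maximum likelihood estimates (mu, sigma)
se  <- sqrt(diag(solve(fit$hessian)))         # approximate standard errors
print(mle)
print(se)
```

The Hessian of the negative log-likelihood at the minimum is the observed information matrix, so the square roots of the diagonal of its inverse give the approximate standard errors asked for in part (f). With the real data, `w` would be replaced by the vector read from trnormal.dat and the truncation adjusted to match the density given in the question.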
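Original-content statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.
In case of infringement, please contact cloudcommunity@tencent.com for removal.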