Assignment 1: Multiple Logistic Regression (3 pages)
You will examine the use of multiple logistic regression in research, create a multiple logistic regression model in SPSS, interpret the output, and evaluate the effectiveness of multiple logistic regression.
Evaluate use of multiple logistic regression in research
Apply multiple logistic regression diagnostics
Apply multiple logistic regression assumptions
Apply methods to perform multiple logistic regression
Interpret results of multiple logistic regression
Much of the study of epidemiology and biostatistics addresses the following outcomes: disease or no disease, death or no death, exposure or no exposure. These are dichotomous outcomes, making multiple logistic regression a reasonable choice for evaluation of epidemiological data. Many doctoral epidemiology students, therefore, choose multiple logistic regression to analyze research data.
For this Assignment, review the articles listed in the References.
Select one article for the Assignment. If you use material from any of the articles below, cite it.
Write up the following key elements of the article you selected, putting each key element in bold as a heading.
Identify variables: independent variable(s), dependent variable(s), and confounders.
What was the research question?
Why was multiple logistic regression used?
What was the main result(s)?
What was the interpretation?
What are your thoughts on the limitation(s) of the study?
References
Afifi, T. O., Cox, B. J., Martens, P. J., Sareen, J., & Enns, M. W. (2010). The relationship between problem gambling and mental and physical health correlates among a nationally representative sample of Canadian women. Canadian Journal of Public Health, 101(2), 171–175.
Villar, J., Valladares, E., Wojdyla, D., Zavaleta, N., Carroli, G., Velazco, A., … Acosta, A. (2006). Caesarean delivery rates and pregnancy outcomes: The 2005 WHO global survey on maternal and perinatal health in Latin America. Lancet, 367(9525), 1819–1829.
Ang, R. P., & Huan, V. S. (2006). Relationship between academic stress and suicidal ideation: Testing for depression as a mediator using multiple regression. Child Psychiatry & Human Development, 37(2), 133–143.
Laureate Education, Inc. (Executive Producer). (2012). Multiple logistic regression. Baltimore, MD: Author.
Assignment Part Two: Multiple Logistic Regression in Action (4 pages; should include 6 references)
Multiple logistic regression is a model that uses several predictor variables to estimate the likelihood that a dichotomous outcome occurs.
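In symbols, the model is commonly written as shown in the sketch below. The notation (p for the probability of the outcome, x_1 through x_k for the predictors, and beta for the coefficients) is an assumption for illustration and does not come from the assignment materials.

```latex
\[
\mathrm{logit}(p) = \ln\!\left(\frac{p}{1-p}\right)
  = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k,
\qquad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)}},
\qquad
\mathrm{OR}_j = e^{\beta_j}.
\]
```

Here OR_j is the odds ratio for a one-unit increase in x_j with the other predictors held constant.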
For this Assignment, you use multiple logistic regression to analyze a dataset. You identify assumptions required by multiple logistic regression and evaluate whether they have been met by the data. Finally, you interpret your results and evaluate the use of multiple logistic regression.
The Assignment
1. Variables and variable selection (20 Points)
a. Use a table to list the variables Sex, Age in Years, Serum Cholesterol, Obese, and Hypertension, together with the level of measurement of each. (10 Points)
b. Create new variables Age_Cat and Chole_Cat:
- Age_Cat: Convert Age in Years into a categorical variable with 2 categories: Less than 40, and 40 and greater
- Chole_Cat: Convert Serum Cholesterol into a categorical variable with 3 categories: Under 200, 200-299, and 300 and greater
Add the new variables to each record by coding the responses to the original variable using the assigned categories. Be sure that the Variable View in SPSS has the correct information on the 2 new variables. (20 Points)
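The recoding described above can be sketched in Python to check the category boundaries before doing the same in SPSS. This is only an illustration: the numeric codes (0, 1, 2), the function names, and the sample records are assumptions, not part of the assignment dataset.

```python
# Illustrative recoding of Age in Years and Serum Cholesterol into categories.
# The numeric codes and the sample records are assumptions for this sketch.

AGE_LABELS = {0: "Less than 40", 1: "40 and greater"}
CHOLE_LABELS = {0: "Under 200", 1: "200-299", 2: "300 and greater"}


def age_cat(age_years):
    """Return the Age_Cat code for an age in years."""
    return 0 if age_years < 40 else 1


def chole_cat(serum_cholesterol):
    """Return the Chole_Cat code for a serum cholesterol value."""
    if serum_cholesterol < 200:
        return 0
    if serum_cholesterol < 300:
        return 1
    return 2


# Made-up records, only to show the recoding step.
records = [
    {"ID": 1, "Age": 35, "Cholesterol": 185},
    {"ID": 2, "Age": 40, "Cholesterol": 200},
    {"ID": 3, "Age": 62, "Cholesterol": 310},
]

for record in records:
    record["Age_Cat"] = age_cat(record["Age"])
    record["Chole_Cat"] = chole_cat(record["Cholesterol"])
    print(record["ID"], AGE_LABELS[record["Age_Cat"]], CHOLE_LABELS[record["Chole_Cat"]])
```

The boundary values matter most: an age of exactly 40 belongs to "40 and greater", and cholesterol values of exactly 200 and 300 belong to "200-299" and "300 and greater" respectively.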
2. Simple Binary Logistic Regression (30 Points)
a. Use Hypertension as the dependent variable and Chole_Cat as the independent variable in the first model. Report the Odds Ratio and significance of the Odds Ratio for the relationship between the dependent and independent variables. (40 Points)
b. Use Hypertension as the dependent variable and Serum Cholesterol (the original variable) as the independent variable in the second model. Report the Odds Ratio and significance of the Odds Ratio for the relationship between the dependent and independent variables. (20 Points)
c. How does the level of measurement for the independent variable affect the outcome (include the OR and its significance in your response)? How does the level of measurement of the independent variable change your interpretation of the Odds Ratio? (20 Points)
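The difference in interpretation between a categorical and a continuous predictor can be illustrated with a short Python sketch. For a categorical predictor, the odds ratio compares one category with a reference category. For a continuous predictor, the odds ratio exp(B) applies to a one-unit increase, so a k-unit increase multiplies the odds by exp(B * k). The counts and the coefficient below are made up for illustration, and the function names are hypothetical.

```python
import math


def odds_ratio_2x2(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald 95% confidence interval.

    a = category of interest, outcome present
    b = category of interest, outcome absent
    c = reference category, outcome present
    d = reference category, outcome absent
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def odds_ratio_for_increase(beta, units):
    """Odds ratio for a `units`-sized increase in a continuous predictor."""
    return math.exp(beta * units)


# Made-up counts: hypertension by cholesterol category versus reference.
or_cat, ci_low, ci_high = odds_ratio_2x2(30, 70, 20, 80)
print(f"Categorical OR = {or_cat:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")

# Made-up coefficient for cholesterol as a continuous predictor.
beta = 0.01
print(f"OR per 1 unit   = {odds_ratio_for_increase(beta, 1):.3f}")
print(f"OR per 50 units = {odds_ratio_for_increase(beta, 50):.3f}")
```

An odds ratio close to 1 for a continuous predictor is therefore not necessarily a weak effect; it describes the change in odds for a single unit of the original scale.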
3. Multiple Logistic Regression (70 Points)
a. Run a multiple binary logistic regression model in SPSS with Hypertension as the dependent variable and Chole_Cat, Age_Cat, Obese, and Sex as the covariates. Include the output in your submission. (20 Points)
b. Identify the Odds Ratio and the significance of the Odds Ratio for each of the covariates. How has the relationship between Chole_Cat and Hypertension changed with the addition of the other variables (compare to the output from #2a)? (25 Points)
c. Test the assumption that the model fits the data using the Hosmer-Lemeshow Goodness of Fit test. Interpret the Chi Square statistic given in the output of this test and state what it means in terms of the assumptions needed to use logistic regression with these data. (20 Points)
d. Rerun the logistic regression model from #3a and use the Save function to create the following new variables: Predicted Probabilities, Deviance Residuals, and Cook’s Distance. Evaluate the model using these saved variables and the following Scatter Plots. (25 Points)
- Create a Scatter Plot of the Deviance Residuals (DEV) and the variable ID: Are there any outliers? What does this mean when evaluating your model?
- Create a Scatter Plot of Cook’s Distance (COO) and the variable ID: Are there any influential cases? What does this mean when evaluating your model?
- Create a Scatter Plot of the Deviance Residuals (DEV) and the Predicted Probabilities (PRE). Discuss whether anything in this scatterplot could cause you some concern in terms of your model.
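To make the fit and residual diagnostics above more concrete, the following Python sketch computes deviance residuals and a Hosmer-Lemeshow chi-square statistic from observed outcomes and predicted probabilities. It is a simplified illustration under stated assumptions: the function names are hypothetical, the data are made up, the cutoff of 2 for flagging residuals is a common rule of thumb rather than a requirement of the assignment, and the equal-size grouping may differ from the way SPSS forms its groups. Predicted probabilities are assumed to lie strictly between 0 and 1.

```python
import math


def deviance_residual(y, p):
    """Deviance residual for one case with outcome y (0 or 1) and probability p."""
    log_lik = y * math.log(p) + (1 - y) * math.log(1 - p)
    sign = 1.0 if y == 1 else -1.0
    return sign * math.sqrt(-2.0 * log_lik)


def flag_outliers(ids, residuals, cutoff=2.0):
    """Return the IDs whose absolute deviance residual exceeds the cutoff."""
    return [i for i, d in zip(ids, residuals) if abs(d) > cutoff]


def hosmer_lemeshow(y, p, groups=10):
    """Return (chi-square statistic, degrees of freedom) for equal-size groups."""
    pairs = sorted(zip(p, y))  # order cases by predicted probability
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        n_g = len(chunk)
        expected = sum(p_i for p_i, _ in chunk)  # expected number of events
        observed = sum(y_i for _, y_i in chunk)  # observed number of events
        mean_p = expected / n_g
        stat += (observed - expected) ** 2 / (n_g * mean_p * (1 - mean_p))
    return stat, groups - 2


# Made-up outcomes and predicted probabilities for eight cases.
ids = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 1, 1, 1, 0]
p = [0.10, 0.20, 0.30, 0.05, 0.70, 0.80, 0.90, 0.40]

residuals = [deviance_residual(y_i, p_i) for y_i, p_i in zip(y, p)]
outliers = flag_outliers(ids, residuals)
print("Cases with |deviance residual| > 2:", outliers)

hl_stat, hl_df = hosmer_lemeshow(y, p, groups=4)
print(f"Hosmer-Lemeshow chi-square = {hl_stat:.3f} on {hl_df} df")
```

A large Hosmer-Lemeshow chi-square with a small p-value suggests that the model does not fit the data well, while a non-significant result is consistent with adequate fit. The p-value comes from a chi-square distribution with the stated degrees of freedom and is not computed in this sketch. Cook's Distance is also left out because it depends on leverage values from the fitted model.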
