Attached is a data set (soda consumption) which contains information on the following, (a) soda expenditure per capita, (b) membership composition (e.g. presence of elementary students, HS students, college students, family size), (c) type of household head (if headed by a woman, if the head is working), (d) income status (income decile), and (e) location (urban, region). Income decile is a qualitative variable. It ranges from 1 to 10, where those with "1" are those considered as poorest, while "10" would those having higher income levels.
Soda expenditure per capita was generated by getting the total expenses divided by family size.
Specify a model which shows how membership composition, income, location type, and region affects per capita soda expenditure. Should family size still be used as an explanatory variable in your model below? Why or why not? Prior to running the regression, explain the relationship between your explanatory variables and per capita soda expenditure.
Based on your answer on a, run and show the predicted model below. Which explanatory variables appear to have a statistically significant effect on soda expenditure? Why or why not?