Introduction to Predictive Modeling: Regressions - SAS Miner Question
Predictive Modeling Using Regression
Return to the Chapter 3 Organics diagram. Attach the StatExplore tool to the ORGANICS data source and run it.
In preparation for regression, is any missing-value imputation needed?
If yes, should you do this imputation before generating the decision tree models?
Why or why not?
Add an Impute node to the diagram and connect it to the Data Partition node. Set the node to impute U for unknown class variable values and the overall mean for unknown interval variable values. Create imputation indicators for all imputed inputs.
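For readers who want to see what this Impute step does to the data, here is a minimal Python sketch, not the Impute node's actual implementation. It fills missing class values with "U", fills missing interval values with the overall mean of the observed values, and adds a 0/1 indicator for each input that had something imputed. The M_ prefix for indicators, the class variable name Gender, and the toy rows are assumptions made for illustration only.

```python
from statistics import mean


def impute_with_indicators(rows, class_vars, interval_vars):
    """Return copies of rows with missing values (None) filled in.

    Class variables get the constant "U"; interval variables get the
    overall mean of their observed values. Every variable that had at
    least one missing value also gets an M_<name> indicator column.
    """
    fill = {v: "U" for v in class_vars}
    for v in interval_vars:
        observed = [r[v] for r in rows if r[v] is not None]
        fill[v] = mean(observed)

    # Only inputs that actually need imputation receive an indicator.
    imputed_vars = [v for v in fill if any(r[v] is None for r in rows)]

    out = []
    for r in rows:
        new = dict(r)
        for v in imputed_vars:
            new["M_" + v] = 1 if r[v] is None else 0
            if r[v] is None:
                new[v] = fill[v]
        out.append(new)
    return out


# Toy rows (made up); Gender is a hypothetical class input.
rows = [
    {"Gender": "F", "DemAffl": 10},
    {"Gender": None, "DemAffl": 6},
    {"Gender": "M", "DemAffl": None},
]
result = impute_with_indicators(rows, ["Gender"], ["DemAffl"])
```

The indicators let the regression pick up any signal carried by the fact that a value was missing, which the filled-in value alone would hide.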
Add a Regression node to the diagram and connect it to the Impute node.
Choose Stepwise as the selection model and Validation Error as the selection criterion.
Run the Regression node and view the results.
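The idea behind using validation error as the selection criterion can be sketched as follows: stepwise selection produces a sequence of candidate models, and the one with the lowest error on the validation data is kept. This is only an illustration of that idea, not the Regression node's internal logic. The input names x1, x2, x3 and the error values are hypothetical.

```python
def pick_step_by_validation_error(steps):
    """Pick the stepwise step with the smallest validation error.

    steps is a list of (variables_in_model, validation_error) tuples in
    the order the stepwise procedure produced them. On ties, the earliest
    (and therefore smallest) model wins.
    """
    best_index = min(range(len(steps)), key=lambda i: steps[i][1])
    return best_index, steps[best_index]


# Hypothetical stepwise history; names and error values are made up.
history = [
    (["x1"], 0.30),
    (["x1", "x2"], 0.25),
    (["x1", "x2", "x3"], 0.27),
]
best_index, (best_vars, best_error) = pick_step_by_validation_error(history)
```

In this toy history, adding x3 lowers the training fit but raises the validation error, so the two-input model is the one that would be kept.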
Which variables are included in the final model?
Which variables are important in this model?
What is the validation ASE?
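As a reminder of what ASE measures, it is the average squared difference between the actual target value and the model's prediction, computed here on the validation cases. The sketch below assumes a binary target coded 0/1 and predicted probabilities; the values are made up.

```python
def average_squared_error(actual, predicted):
    """Mean of (actual - predicted)^2 over all cases."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)


# Toy validation cases (made up): 0/1 target and predicted probabilities.
y_valid = [1, 0, 0, 1]
p_valid = [0.5, 0.5, 0.0, 1.0]
ase = average_squared_error(y_valid, p_valid)
```

Lower values are better, so comparing models on validation ASE means preferring the one with the smaller number.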
In preparation for regression, are any transformations of the data warranted?
Why or why not?
Disconnect the Impute node from the Data Partition node.
Add a Transform Variables node to the diagram and connect it to the Data Partition node.
Connect the Transform Variables node to the Impute node.
Apply a log transformation to the DemAffl and PromTime inputs.
Run the Transform Variables node. Explore the exported training data. Did the transformations result in less skewed distributions?
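One way to check the skewness question numerically is to compare a skewness statistic before and after the transformation. The sketch below uses log(x + 1), which is an assumption made here so that zero values are allowed; it may not match the exact formula the Transform Variables node applies. The sample values are made up and only mimic a right-skewed input.

```python
import math


def skewness(values):
    """Population skewness: third central moment over variance^1.5."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


# Made-up, right-skewed sample standing in for an input such as PromTime.
raw = [1, 2, 2, 3, 4, 6, 9, 15, 40]
logged = [math.log(x + 1) for x in raw]

skew_raw = skewness(raw)
skew_logged = skewness(logged)
```

A skewness closer to zero after the transformation indicates a more symmetric distribution, which is what the histogram comparison in the exported data should also show.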
Rerun the Regression node.
Do the selected variables change?
Does the validation ASE change?
Create a full second-degree polynomial model. How does the validation average squared error for the polynomial model compare to the original model?
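To make "full second-degree polynomial" concrete: for interval inputs it means the original inputs plus every square and every pairwise product. The sketch below builds those terms for a single row. It is an illustration of the term expansion only, not of how the Regression node builds the model, and the row values are made up.

```python
from itertools import combinations_with_replacement


def second_degree_terms(row, inputs):
    """Return the linear terms plus all squares and pairwise products."""
    terms = {name: row[name] for name in inputs}
    for a, b in combinations_with_replacement(inputs, 2):
        terms[f"{a}*{b}"] = row[a] * row[b]
    return terms


# Made-up values for two interval inputs named in the exercise.
row = {"DemAffl": 2.0, "PromTime": 3.0}
expanded = second_degree_terms(row, ["DemAffl", "PromTime"])
```

With k interval inputs this yields k linear terms plus k(k + 1)/2 second-degree terms, so the model grows quickly and the validation ASE is the fair way to judge whether the extra terms help.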