Everyday R code (16): survey question selection techniques

There are several methods we have used to narrow a large pool of survey questions down to a shorter list.

 

1. Correlation

If two questions are highly correlated with each other, one of them is enough to collect the information we need.
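As a rough sketch of the correlation check, the snippet below simulates a small survey and lists the question pairs whose correlation exceeds a cutoff. The question names q1 to q4, the simulated data, the 0.8 cutoff, and the choice of Spearman correlation are all assumptions for illustration, not part of the original workflow.

```r
# Simulated survey responses; q1..q4 are hypothetical question names.
set.seed(1)
n <- 200
q1 <- rnorm(n)
survey <- data.frame(
  q1 = q1,
  q2 = q1 + rnorm(n, sd = 0.2),  # built to nearly duplicate q1
  q3 = rnorm(n),
  q4 = rnorm(n)
)

# Spearman correlation is a common choice for ordinal (Likert-type) answers.
cor_mat <- cor(survey, method = "spearman")

# Keep each pair once (upper triangle) when |correlation| exceeds the cutoff.
cutoff <- 0.8  # assumed threshold; tune it to your own data
idx <- which(abs(cor_mat) > cutoff & upper.tri(cor_mat), arr.ind = TRUE)
redundant_pairs <- data.frame(
  question_a  = rownames(cor_mat)[idx[, "row"]],
  question_b  = colnames(cor_mat)[idx[, "col"]],
  correlation = cor_mat[idx]
)
print(redundant_pairs)
```

For each pair that shows up, one of the two questions is a candidate to drop.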

 

2. Factor Analysis

If two questions load heavily on the same factor, they point in a similar direction, so one of them is enough to collect the information we need.
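A minimal sketch of this idea, using factanal() from base R's stats package on simulated data. The two latent factors, the six question names, and the choice of two factors with varimax rotation are assumptions made for the example; with real survey data the number of factors has to be chosen first.

```r
# Simulate six hypothetical questions driven by two latent factors.
set.seed(2)
n <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)
survey <- data.frame(
  q1 = f1 + rnorm(n, sd = 0.5),
  q2 = f1 + rnorm(n, sd = 0.5),
  q3 = f1 + rnorm(n, sd = 0.5),
  q4 = f2 + rnorm(n, sd = 0.5),
  q5 = f2 + rnorm(n, sd = 0.5),
  q6 = f2 + rnorm(n, sd = 0.5)
)

# Maximum-likelihood factor analysis with two factors (assumed number).
fa <- factanal(survey, factors = 2, rotation = "varimax")
load_mat <- unclass(fa$loadings)  # plain matrix: questions x factors

# Assign each question to the factor on which it loads most strongly.
dominant <- apply(abs(load_mat), 1, which.max)

print(round(load_mat, 2))
print(split(names(dominant), dominant))  # questions grouped by dominant factor
```

Questions that land in the same group are candidates for being represented by a single question, for example the one with the largest loading.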

 

3. Relative importance from Shapley Value Regression (using R package relaimpo)

This method gives a relative importance for each variable in the model, representing how important that variable is to the model. To fit the model we need a Y, which is the response to the overall satisfaction question.
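A small sketch of how this might look with relaimpo, assuming the package is installed. The data are simulated, and the variable names (overall, q1, q2, q3) and coefficients are made up for illustration. The "lmg" metric averages each predictor's contribution to R-square over all orderings of the predictors, which is the Shapley-value style decomposition.

```r
library(relaimpo)  # install.packages("relaimpo") if it is not installed

# Simulated data: 'overall' stands in for the overall satisfaction question.
set.seed(3)
n <- 300
q1 <- rnorm(n)
q2 <- 0.7 * q1 + rnorm(n, sd = 0.7)  # deliberately correlated with q1
q3 <- rnorm(n)
overall <- 0.6 * q1 + 0.3 * q2 + 0.1 * q3 + rnorm(n)
survey <- data.frame(overall, q1, q2, q3)

fit <- lm(overall ~ q1 + q2 + q3, data = survey)

# type = "lmg": R-square contribution averaged over all predictor orderings.
# rela = TRUE: rescale the shares so that they sum to 1.
imp <- calc.relimp(fit, type = "lmg", rela = TRUE)
shares <- imp@lmg

print(round(sort(shares, decreasing = TRUE), 3))
```

With rela = FALSE the shares sum to the model's R-square instead of 1.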

http://rstudio-pubs-static.s3.amazonaws.com/7077_95c275edb54c4692a060fa68e9640ec5.html

http://www.slideserve.com/gudrun/relative-importance-of-predictors-with-regression-models

How to handle highly correlated predictors (multicollinearity) in survey data (this is the one I used):

http://www.r-bloggers.com/the-relative-importance-of-predictors-let-the-games-begin/

An example explaining how the average contribution to R-square is calculated (page 13):

http://www.predictiveanalyticsworld.com/sanfrancisco/2013/pdf/Day2_1550_Reno_Tuason_Rayner.pdf

 

4. Random Forest

This method gives an importance score for each variable. The higher the score, the more important the variable. (The scores are not on a 0-100% scale.)

https://www.r-bloggers.com/random-forest-variable-importance/
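A minimal sketch with the randomForest package, assuming it is installed. The simulated data, the variable names, and the settings (ntree = 500, permutation importance) are illustrative assumptions rather than the settings used in the original analysis.

```r
library(randomForest)  # install.packages("randomForest") if it is not installed

# Simulated data: q1 matters most, q2 a little, q3 is pure noise.
set.seed(4)
n <- 300
survey <- data.frame(q1 = rnorm(n), q2 = rnorm(n), q3 = rnorm(n))
survey$overall <- 2 * survey$q1 + 0.5 * survey$q2 + rnorm(n)

# importance = TRUE is required to get permutation-based importance.
rf <- randomForest(overall ~ ., data = survey, ntree = 500, importance = TRUE)

# type = 1: permutation importance (%IncMSE for a regression forest).
imp <- importance(rf, type = 1)
print(imp[order(imp[, 1], decreasing = TRUE), , drop = FALSE])
```

The scores are only meaningful relative to each other, so read them as a ranking of questions rather than as percentages.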

 

5. Cp Statistics

The closer Cp is to the number of parameters in the model (the predictors plus the intercept), the better the model. It is closely related to R-square and AIC.
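As an illustration, the sketch below computes Mallows' Cp by hand in base R for every subset of three hypothetical questions, using Cp = SSE_p / s^2 - n + 2p, where s^2 is the residual variance of the full model and p counts the coefficients including the intercept. The data and variable names are simulated assumptions.

```r
# Simulated data: q3 has no real effect on 'overall'.
set.seed(5)
n <- 200
survey <- data.frame(q1 = rnorm(n), q2 = rnorm(n), q3 = rnorm(n))
survey$overall <- 1 + 0.8 * survey$q1 + 0.5 * survey$q2 + rnorm(n)

predictors <- c("q1", "q2", "q3")

# Error variance estimated from the full model.
full <- lm(overall ~ ., data = survey)
sigma2 <- sum(resid(full)^2) / df.residual(full)

# Enumerate every non-empty subset of the predictors.
subsets <- unlist(
  lapply(seq_along(predictors),
         function(k) combn(predictors, k, simplify = FALSE)),
  recursive = FALSE
)

# Fit each candidate model and compute its Cp.
cp_table <- do.call(rbind, lapply(subsets, function(vars) {
  fit <- lm(reformulate(vars, response = "overall"), data = survey)
  p <- length(coef(fit))  # number of parameters, including the intercept
  data.frame(
    model = paste(vars, collapse = " + "),
    p     = p,
    Cp    = sum(resid(fit)^2) / sigma2 - n + 2 * p
  )
}))
cp_table$gap <- abs(cp_table$Cp - cp_table$p)  # distance of Cp from p

print(cp_table[order(cp_table$Cp), ])
```

The full model always has Cp equal to p by construction, so the comparison is most useful among the smaller models: look for a small model whose Cp is low and close to its p.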
