We used several methods to select a short list of questions from a large pool of survey questions.
1. Correlation
If two questions are highly correlated with each other, they carry largely the same information, so keeping one of them is enough.
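The correlation screen can be sketched as follows. This is only a minimal illustration, not the procedure we actually ran: the question names, the ratings, and the 0.8 cutoff are all made-up assumptions, and `drop_redundant` is a hypothetical helper.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_redundant(responses, threshold=0.8):
    """Keep a question only if it is not highly correlated with one already kept.

    The threshold is an assumed cutoff, not a value taken from the notes.
    """
    kept = []
    for name in responses:
        if all(abs(pearson(responses[name], responses[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical 1-5 ratings from eight respondents.
responses = {
    "q1_staff_friendly": [5, 4, 4, 2, 1, 3, 5, 2],
    "q2_staff_helpful":  [5, 4, 5, 2, 1, 3, 4, 2],
    "q3_price_fair":     [2, 5, 1, 4, 3, 2, 3, 5],
}
kept = drop_redundant(responses)
print(kept)
```

Note that the result depends on the order of the questions: the first question of a correlated pair is the one that survives, so it may be worth ordering the questions by how much we want to keep them.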
2. Factor Analysis
If two questions load heavily on the same factor, they measure a similar underlying dimension, so one question is enough to collect the information we need.
3. Relative importance from Shapley Value Regression (using R package relaimpo)
This method gives a relative importance for each variable in the model, i.e. how much that variable contributes to the model's R-square. To fit the model we need a response Y, which here is the answer to the overall satisfaction question.
http://rstudio-pubs-static.s3.amazonaws.com/7077_95c275edb54c4692a060fa68e9640ec5.html
http://www.slideserve.com/gudrun/relative-importance-of-predictors-with-regression-models
How to handle highly correlated predictors (multicollinearity) in survey data (used this one):
http://www.r-bloggers.com/the-relative-importance-of-predictors-let-the-games-begin/
A worked example of how the average contribution to R-square is calculated (page 13):
http://www.predictiveanalyticsworld.com/sanfrancisco/2013/pdf/Day2_1550_Reno_Tuason_Rayner.pdf
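The "average contribution to R-square" idea can be sketched in plain Python. This is an illustration under stated assumptions, not the relaimpo implementation: predictors are entered into the regression in every possible order, and each predictor's gain in R-square is averaged over all orders (as far as we understand, this is the quantity relaimpo reports as the `lmg` metric). The data and the helper names `solve`, `r_squared`, and `shapley_importance` are hypothetical.

```python
from itertools import permutations
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def solve(a, b):
    """Solve a small linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def r_squared(xs, y, subset):
    """R-square of the linear regression (with intercept) of y on `subset`."""
    if not subset:
        return 0.0
    rxx = [[pearson(xs[i], xs[j]) for j in subset] for i in subset]
    rxy = [pearson(xs[i], y) for i in subset]
    beta = solve(rxx, rxy)  # standardized regression coefficients
    return sum(b * r for b, r in zip(beta, rxy))

def shapley_importance(xs, y):
    """Average each predictor's R-square gain over all entry orders."""
    names = list(xs)
    total = {name: 0.0 for name in names}
    orders = list(permutations(names))
    for order in orders:
        entered = []
        for name in order:
            before = r_squared(xs, y, entered)
            entered.append(name)
            total[name] += r_squared(xs, y, entered) - before
    return {name: t / len(orders) for name, t in total.items()}

# Hypothetical 1-5 ratings; y stands in for the overall satisfaction question.
xs = {
    "staff": [5, 4, 4, 2, 1, 3, 5, 2],
    "price": [2, 5, 1, 4, 3, 2, 3, 5],
    "speed": [4, 3, 5, 1, 2, 3, 4, 1],
}
y = [5, 4, 4, 2, 1, 3, 5, 3]
importance = shapley_importance(xs, y)
full_r2 = r_squared(xs, y, list(xs))
for name, value in importance.items():
    print(name, round(value, 3), "share:", round(value / full_r2, 3))
```

The contributions add up to the R-square of the full model, which is why they can be reported as shares. Enumerating all orders grows factorially with the number of predictors, so this brute-force sketch is only practical for a handful of questions.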
4. Random Forest
This method gives an importance score for each variable; the higher the score, the more important the variable. (The scores are not normalized to a 100% scale.)
https://www.r-bloggers.com/random-forest-variable-importance/
5. Cp Statistics
The closer Cp is to the number of parameters in the model (the predictors plus the intercept), the better the model. Cp is closely related to R-square and AIC.
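For reference, a minimal sketch of the usual Mallows' Cp formula, Cp = SSE_p / sigma^2 - n + 2p, where p counts the estimated parameters of the candidate model including the intercept and sigma^2 is the error variance estimated from the full model. The function name `mallows_cp` and all numbers below are hypothetical, chosen only to show the arithmetic.

```python
def mallows_cp(sse_subset, sse_full, n_obs, p_subset, p_full):
    """Mallows' Cp for a candidate model.

    p_subset and p_full count estimated parameters, intercept included.
    """
    sigma2 = sse_full / (n_obs - p_full)  # error variance from the full model
    return sse_subset / sigma2 - n_obs + 2 * p_subset

# Hypothetical sums of squared errors from 50 respondents;
# the full model has 5 questions plus an intercept (6 parameters).
n_obs, sse_full, p_full = 50, 88.0, 6

cp_small = mallows_cp(130.0, sse_full, n_obs, p_subset=2, p_full=p_full)
cp_medium = mallows_cp(95.0, sse_full, n_obs, p_subset=3, p_full=p_full)
cp_full = mallows_cp(sse_full, sse_full, n_obs, p_subset=p_full, p_full=p_full)
print(cp_small, cp_medium, cp_full)
```

With these made-up numbers the three-parameter model has a Cp close to 3, while the two-parameter model has a Cp far above 2, which suggests it leaves out something important. The full model always has Cp equal to its own parameter count, so it is not informative on its own.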