I built a gradient-boosted model in R with xgboost, using data from a survey with multiple questions, where each question has two or more answer choices. The feature importance plot shows a Gain value for each answer choice of every question, representing how much that answer choice contributes to the model's predictions. Is it legitimate to sum the feature importances of a question's answer choices to get an aggregate feature importance for that survey question?
Example:

Feature              Gain
question_1.answer_1  X
question_1.answer_2  Y
I would like to calculate the overall Gain (the feature importance) of question_1 as X + Y. Is that legitimate?
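
To make the calculation concrete, here is a minimal sketch in base R of the sum I have in mind. The table `imp` is a hypothetical stand-in shaped like the output of `xgb.importance()`, the Gain values are made up for illustration, and the code assumes that everything before the first "." in a feature name is the question name. It only shows the mechanics of the X + Y step; it does not settle whether the aggregate is meaningful.

```r
# Hypothetical importance table, shaped like the output of xgboost::xgb.importance().
# The Gain values are invented for illustration; they do not come from a real model.
imp <- data.frame(
  Feature = c("question_1.answer_1", "question_1.answer_2",
              "question_2.answer_1", "question_2.answer_2",
              "question_2.answer_3"),
  Gain = c(0.30, 0.10, 0.25, 0.20, 0.15),
  stringsAsFactors = FALSE
)

# Assumption: the question name is everything before the first "." in Feature.
imp$Question <- sub("\\..*$", "", imp$Feature)

# Sum Gain over the answer choices belonging to each question.
by_question <- aggregate(Gain ~ Question, data = imp, FUN = sum)

# Order questions from largest to smallest aggregated Gain.
by_question <- by_question[order(-by_question$Gain), ]
print(by_question)
```

With a real model, `imp` would be replaced by the importance table itself, keeping the same `Feature` and `Gain` columns.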