9

If I run a model (called clf in this case), I get output that looks like this. How can I tie this to the feature inputs that were used to train the classifier?

>>> clf.feature_importances_

array([ 0.01621506,  0.18275428,  0.09963659,... ])
Bach
Krishan Gupta

3 Answers

15

As mentioned in the comments, the order of the feature importances appears to be the order of the columns in the "x" input variable (which I've converted from Pandas to a Python native data structure). Here x.columns holds the column names of the original pandas DataFrame. I use this code to generate a list of tuples that look like this: (feature_name, feature_importance).

list(zip(x.columns, clf.feature_importances_))
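To make the pairing concrete, here is a minimal sketch that needs no fitted model. The column names are hypothetical stand-ins for x.columns, and the three importance values are the ones visible in the question's output, standing in for clf.feature_importances_.

```python
# Hypothetical column names, standing in for x.columns.
feature_names = ["age", "income", "height"]

# The three values visible in the question, standing in for
# clf.feature_importances_ (same order as the columns).
importances = [0.01621506, 0.18275428, 0.09963659]

# Pair each name with the importance at the same position.
pairs = list(zip(feature_names, importances))

# Optionally rank the pairs, largest importance first.
ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)

for name, score in ranked:
    print(f"{name}: {score:.4f}")
```

The pairing only holds if the names are listed in the same column order that was used to fit the model.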
Krishan Gupta
  • As larsmans says, the obvious answer is the right one. Glad I got some confirmation, thanks! – Krishan Gupta May 27 '14 at 22:42
  • 2
    Would you say what is `x` and what is `x.columns`? – Bach May 28 '14 at 07:22
  • 2
    What @Bach says and technically your answer is not answering your question, it is just code that generates tuples :) Moreover you seem to be using `pandas.DataFrame`s which actually don't work with `sklearn` (at the moment, unclear for the future afaik). – eickenberg May 28 '14 at 14:11
  • Ok @eickenberg good points. I added some clarification to the answer. – Krishan Gupta Jun 05 '14 at 04:33
3

You may save the result in a pandas data frame as follows:

pandas.DataFrame({'col_name': clf.feature_importances_}, index=x.columns).sort_values(by='col_name', ascending=False)

Sorting in descending order puts the features with the highest importance scores at the top, which gives a quick indication of which features matter most to the model.
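An end-to-end sketch of the same idea, under some assumptions that are not in the answer: the model is a RandomForestClassifier, the data is scikit-learn's bundled iris dataset loaded as a DataFrame, and the column is named "importance" rather than 'col_name'.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Assumption: iris loaded as a DataFrame, so the features have column names.
iris = load_iris(as_frame=True)
x, y = iris.data, iris.target

# Assumption: a random forest; any estimator exposing feature_importances_ works.
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(x, y)

# feature_importances_ follows the column order of x, so x.columns
# can be used directly as the index.
importances = pd.DataFrame(
    {"importance": clf.feature_importances_}, index=x.columns
).sort_values(by="importance", ascending=False)

print(importances)
```

The exact scores depend on the model and its random seed, so no expected output is shown here.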

0

The order is the order of the features/attributes of your training/data set.

You can display these importance scores next to their corresponding attribute/feature names as below (assuming your_data_set is a pandas DataFrame, list(your_data_set) returns its column names):

attributes = list(your_data_set)

sorted(zip(clf.feature_importances_, attributes), reverse=True)

The output could look something like this, with the highest score first because of reverse=True:

[(0.18275428, 'feature3'),
 (0.09963659, 'feature2'),
 (0.01621506, 'feature1'),
 ...]
Shaido