I have data in SVMlight format (label feature1:value1 feature2:value2 ...), like this:
talk.politics.guns a:12 about:1 abrams:1 absolutely:1
talk.politics.mideast I:4 run:10 go:3
I tried sklearn.datasets.load_svmlight_file, but it doesn't seem to work with categorical string features and labels. I am trying to store the data in a pandas DataFrame. Any pointers would be appreciated.
You can do it by hand. One way to convert the file into a DataFrame:
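A minimal sketch of the by-hand approach, not necessarily the only way to do it. The function name `svmlight_to_dataframe` and the `label` column name are illustrative choices, and the sketch assumes every value is numeric, tokens are whitespace-separated, and no feature is itself named `label`:

```python
import pandas as pd

def svmlight_to_dataframe(lines):
    """Parse 'label feat:value ...' lines into a DataFrame, one row per line."""
    labels = []
    feature_rows = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        labels.append(tokens[0])
        row = {}
        for token in tokens[1:]:
            # rpartition splits on the last ':', so a feature name may contain ':'
            name, _, value = token.rpartition(":")
            row[name] = float(value)
        feature_rows.append(row)
    # Features missing from a line come out as NaN; treat them as 0
    df = pd.DataFrame(feature_rows).fillna(0.0)
    # Assumption: no feature is called 'label' (insert would raise ValueError)
    df.insert(0, "label", labels)
    return df
```

Any iterable of lines works, so an open file handle can be passed directly, for example `with open("data.txt") as f: df = svmlight_to_dataframe(f)`, where `data.txt` is a placeholder path.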
The resulting DataFrame for your example file: