I have an svmlight-formatted file with values of the form:
92.91 18256731:1 71729421:1 72329637:1 83328561:1 118265976:1 134892759:1 198163358:1 352348616:1 526943048:1
5.30 102156934:1 134892759:1 198163358:1 254112843:1 262373758:1 512748316:1 526943048:1
22.00 32172600:1 72329637:1 118265976:1 134892759:1 198163358:1 411824213:1 443226486:1 445371412:1 526943048:1
I am trying to import this into h2o using h2o.import_file(fname.svmlight).
Does h2o support high-dimensional sparse binary features?
Do I need to convert the hashed values into smaller indices for this to work?
Your three lines of svmlight are like a virus! According to top, the java process is as close to 800% CPU (8-core machine) as it can get. After 45m of CPU effort (5-6 mins wall clock) I had to use kill -9 on it to get my machine back.
Even if your type of file is not officially supported, I think the fact that it brings down a machine makes it a serious bug, so I've reported it here: https://0xdata.atlassian.net/browse/PUBDEV-4798
BTW, you can find a unit test showing use of svmlight here: https://github.com/h2oai/h2o-3/blob/30f382efac687be3959a253d975cb48c341c92b4/h2o-r/tests/testdir_misc/runit_parser_type.R
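On the question of converting the hashed values: I have not verified that the huge feature indices are what triggers the hang, so treat this as an assumption. If they are, one workaround worth trying is to remap the hashed ids to small contiguous indices before importing. Below is a minimal sketch in plain Python; the function name remap_svmlight_indices is made up for illustration and is not part of h2o.

```python
def remap_svmlight_indices(lines):
    """Rewrite hashed feature ids as compact 1-based indices.

    Returns the rewritten lines plus the id -> index mapping, so the
    same mapping can be reused later (e.g. on scoring data).
    """
    index_map = {}  # original hashed id -> compact index (1, 2, 3, ...)
    rewritten = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        label, features = parts[0], parts[1:]
        pairs = []
        for feature in features:
            hashed_id, value = feature.split(":", 1)
            # Assign the next free index the first time an id is seen.
            compact = index_map.setdefault(int(hashed_id), len(index_map) + 1)
            pairs.append((compact, value))
        # svmlight expects feature indices in ascending order within a row.
        pairs.sort()
        rewritten.append(" ".join([label] + [f"{i}:{v}" for i, v in pairs]))
    return rewritten, index_map


if __name__ == "__main__":
    sample = [
        "5.30 102156934:1 134892759:1 198163358:1",
        "22.00 32172600:1 134892759:1 198163358:1",
    ]
    new_lines, mapping = remap_svmlight_indices(sample)
    for row in new_lines:
        print(row)
```

You would write the rewritten lines to a new file and import that one instead. Keep the mapping around, since any later data has to be encoded with the same indices.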