I tried using ParallelALSFactorizationJob, but it crashes here:
Exception in thread "main" java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
at org.apache.hadoop.util.Shell.run(Shell.java:418)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:739)
Command line help mentions using filesystem, but it seems it wants hadoop. How can I run it on Windows, mahout.cmd file is broken:
"===============DEPRECATION WARNING==============="
"This script is no longer supported for new drivers as of Mahout 0.10.0"
"Mahout's bash script is supported and if someone wants to contribute a fix for this"
"it would be appreciated."
So is that possible (ALS + Windows - hadoop)?
Mahout is a community-driven project and its community is very strong.
-Tiwary, C. (2015). Learning Apache Mahout.
Apache Spark is an open-source, in-memory, general purpose computing system that runs on both Windows and Unix like systems. Instead of Hadoop-like disk-based computation, Spark uses cluster memory to upload all the data into the memory, and this data can be queried repeatedly.
-Gupta, A (2015). Learning Apache Mahout Classification.
(This last book also provides a step by step guide Using Mahout's Spark shell (they don't use Windows and it isn't clear if they use Hadoop or not though). For more information on that topic, see the implementation section at https://mahout.apache.org/users/sparkbindings/play-with-shell.html.)
In addition to this, you can build recommendation engines using Spark such as DataFrames, RDD, Pipelines, and Transforms available in Spark MLlib and
-Gorakala, S. (2016). Building Recommendation Engines.
At this point, there's one question still to answer before answering your question: can we run Spark without Hadoop?.
So, yes, it's possible to use ALS method on Windows using Spark (without Hadoop).