Apparently, the LSHModel in MLlib supports Spark Structured Streaming as of Spark 2.4 (https://issues.apache.org/jira/browse/SPARK-24465).
However, it's not clear to me how. For instance, can an approxSimilarityJoin from the MinHashLSH transformation (https://spark.apache.org/docs/latest/ml-features#lsh-operations) be applied directly to a streaming DataFrame?
I can't find more information about it online. Could someone help me?
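You need to:
1. Persist the trained model (e.g. modelFitted) somewhere accessible to your streaming job. This is done outside of your streaming job.
2. Load this model within your streaming job and apply it to your streaming DataFrame (e.g. df) with the model's transform method.
It might be required to get the streaming DataFrame into the correct format to be used in the model prediction.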