Spark ALS model.transform(test) drops rows from test. What could be the reason?

227 Views Asked by Anmol Deep At 26 May 2022 at 13:55

test (a table with columns: user_id, item_id, rating, with 6.2M rows)

from pyspark.ml.recommendation import ALS

# "drop" removes rows whose prediction would be NaN
als = ALS(userCol="user_id",
          itemCol="item_id",
          ratingCol="rating",
          coldStartStrategy="drop",
          implicitPrefs=True)
model = als.fit(train)
predictions = model.transform(test)

predictions (a table with columns: user_id, item_id, rating, prediction, but with only 1.7M rows)

Why did model.transform(test) drop the rest of the rows? It should have been able to calculate a prediction score for every user_id, item_id combination, right?

Is it because I have used coldStartStrategy="drop"?

But if there is a rating calculated for all user_id, item_id combinations in test, no row should be dropped, yes?

There is 1 best solution below

Anmol Deep On 26 May 2022 at 14:59 BEST ANSWER

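Yes, it is entirely because of the coldStartStrategy="drop" option. The model only learns factors for users and items that appear in the training data. For a test row whose user_id or item_id had no interactions in train, there are no factors to score with, so the prediction comes out as NaN, and "drop" removes every such row from the output of transform. The missing 4.5M rows are the ones with a user or item unseen during training.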
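To make the rule concrete, here is a minimal sketch in plain Python (no Spark needed) of which test rows survive. It is a toy model of the filtering behaviour, not Spark's actual implementation; the function name split_cold_start and the sample rows are made up for illustration.

```python
# Toy model of coldStartStrategy="drop": plain Python, NOT Spark's implementation.
# split_cold_start and the sample data below are hypothetical, for illustration only.

def split_cold_start(train_rows, test_rows):
    """Split test rows into (kept, dropped) the way "drop" would.

    Rows are (user_id, item_id, rating) tuples. A test row is kept only if
    both its user and its item appear somewhere in the training rows.
    """
    known_users = {user for user, _, _ in train_rows}
    known_items = {item for _, item, _ in train_rows}
    kept, dropped = [], []
    for row in test_rows:
        user, item, _ = row
        if user in known_users and item in known_items:
            kept.append(row)
        else:
            dropped.append(row)
    return kept, dropped


train = [(1, 10, 5.0), (1, 11, 3.0), (2, 10, 4.0)]
test = [
    (1, 10, 4.0),  # user and item both seen in train: kept
    (2, 11, 2.0),  # both seen (not necessarily together): kept
    (3, 10, 1.0),  # user 3 never seen in train: dropped
    (1, 12, 5.0),  # item 12 never seen in train: dropped
]

kept, dropped = split_cold_start(train, test)
print(len(kept), len(dropped))
```

On the real DataFrames, the same count could probably be obtained by left_semi joining test against the distinct user_id values and the distinct item_id values of train. Alternatively, coldStartStrategy="nan" (the default) keeps all rows of test and leaves NaN in the prediction column for the cold-start ones.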
