My LogisticRegression model throws the following exception when I try to predict on a new dataset:

java.lang.IllegalArgumentException: requirement failed: The columns of A don't match the number of elements of x.
    ...
    regexTokenizer = RegexTokenizer(inputCol="TextColumn", outputCol="words")
    add_stopwords = [<list of Stopwords>]
    stopwordsRemover = StopWordsRemover(inputCol="words", outputCol="filtered").setStopWords(add_stopwords)
    countVectors = CountVectorizer(inputCol="filtered", outputCol="features", vocabSize=10000, minDF=5)
    pipeline = Pipeline(stages=[regexTokenizer, stopwordsRemover, countVectors])
    ...

How can I apply the trained model to a new dataset?
