I want to query a Cassandra table with the spark-cassandra-connector using the following statements:
sc.cassandraTable("citizens","records")
.select("identifier","name")
.where( "name='Alice' or name='Bob' ")
And I get this error message:
org.apache.spark.SparkException: Job aborted due to stage failure:
Task 0 in stage 81.0 failed 4 times, most recent failure:
Lost task 0.3 in stage 81.0 (TID 9199, mydomain):
java.io.IOException: Exception during preparation of
SELECT "identifier", "name" FROM "citizens"."records" WHERE token("id") > ? AND token("id") <= ? AND name='Alice' or name='Bob' LIMIT 10 ALLOW FILTERING:
line 1:127 missing EOF at 'or' (...<= ? AND name='Alice' [or] name...)
What am I doing wrong here, and how can I make an OR query using the where clause of the connector?
Your OR clause is not valid CQL. For this few key values (I'm assuming name is a key) you can use an IN clause.
The where clause is used for pushing down CQL to Cassandra, so only valid CQL can go inside it. If you are looking for a Spark-side, SQL-like syntax, check out Spark SQL and Datasets.
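A minimal sketch of both suggestions above, reusing the keyspace, table, and column names from the question. It assumes that name is a key column on which Cassandra accepts IN, that Cassandra is reachable on 127.0.0.1, and that the job is launched with spark-submit with the connector on the classpath. The object name and app name are made up for illustration.

```scala
import com.datastax.spark.connector._
import org.apache.spark.sql.SparkSession

object OrQueryExample {
  def main(args: Array[String]): Unit = {
    // Assumption: Cassandra runs on localhost; change the host for your cluster.
    val spark = SparkSession.builder()
      .appName("or-query-example")
      .config("spark.cassandra.connection.host", "127.0.0.1")
      .getOrCreate()
    val sc = spark.sparkContext

    // Option 1: replace OR with IN, which is valid CQL and is pushed down
    // to Cassandra. The values are bound to the ? placeholders.
    // Assumption: name is a key column that supports IN restrictions.
    val viaIn = sc.cassandraTable("citizens", "records")
      .select("identifier", "name")
      .where("name IN (?, ?)", "Alice", "Bob")
    viaIn.collect().foreach(println)

    // Option 2: read the table as a DataFrame and filter with Spark SQL
    // syntax. OR is allowed here because Spark parses the expression.
    val records = spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "citizens", "table" -> "records"))
      .load()
    records
      .select("identifier", "name")
      .filter("name = 'Alice' OR name = 'Bob'")
      .show()

    spark.stop()
  }
}
```

With option 2 the OR predicate may be evaluated by Spark after reading the rows rather than by Cassandra, so on a large table the IN version is likely the cheaper one when name really is a key.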