Apache Sedona Version Issues


So I'm trying to set up Apache Sedona, but I'm running into strange issues that suggest a version incompatibility. For context, I have Apache Sedona 1.5.1, PySpark 3.2.1, and Scala 2.12.18.

I installed the packages below via Maven.

I'm trying to run this code:

from sedona.spark import *

spark = SedonaContext.builder(). \
    config('spark.jars.packages',
           'org.apache.sedona:sedona-spark-3.4_2.12:1.5.1,'
           'org.datasyslab:geotools-wrapper:1.5.1-28.2,'
           'uk.co.gresearch.spark:spark-extension_2.12:2.11.0-3.4,'
           'org.apache.sedona:sedona-python-adapter-3.0_2.12:1.4.1'). \
    config('spark.jars.repositories', 'https://artifacts.unidata.ucar.edu/repository/unidata-all'). \
    getOrCreate()

sedona = SedonaContext.create(spark)

following their example notebook https://github.com/apache/sedona/blob/master/binder/ApacheSedonaSQL.ipynb, but also making sure to add the Python Adapter package.

But I get this error:

Py4JJavaError: An error occurred while calling o206.showString.
: java.lang.NoSuchMethodError: 'double org.locationtech.jts.geom.Coordinate.getZ()'
    at org.apache.sedona.common.geometrySerde.GeometrySerializer.getCoordinateType(GeometrySerializer.java:449)
    at org.apache.sedona.common.geometrySerde.GeometrySerializer.serializePoint(GeometrySerializer.java:112)
    at org.apache.sedona.common.geometrySerde.GeometrySerializer.serialize(GeometrySerializer.java:43)
    at org.apache.sedona.sql.utils.GeometrySerializer$.serialize(GeometrySerializer.scala:36)
    at org.apache.spark.sql.sedona_sql.expressions.implicits$GeometryEnhancer.toGenericArrayData(implicits.scala:139)
    at org.apache.spark.sql.sedona_sql.expressions.InferredTypes$.$anonfun$buildSerializer$1(InferredExpression.scala:155)
    at org.apache.spark.sql.sedona_sql.expressions.InferredExpression.eval(InferredExpression.scala:71)
    at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:477)
    at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1$$anonfun$applyOrElse$1.applyOrElse(expressions.scala:69)

which suggests the GeoTools/JTS classes aren't resolving correctly. I can load a regular DataFrame just fine; it's only the geospatial operations on it that fail. What's the issue here?
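For what it's worth, a `NoSuchMethodError` on `org.locationtech.jts.geom.Coordinate.getZ()` usually points at mixed Sedona/JTS versions on the classpath rather than a broken GeoTools install. The package list above mixes a Spark-3.4 Sedona 1.5.1 artifact with the pre-1.5 `sedona-python-adapter` at 1.4.1, on a Spark 3.2 runtime. Below is a sketch of what a version-consistent list might look like; the exact artifact name (`sedona-spark-shaded-3.0_2.12`, which Sedona publishes for Spark 3.0 through 3.3) is an assumption to verify against the Sedona download docs for your setup:

```python
# Hypothetical, version-consistent package list for Sedona 1.5.1 on
# PySpark 3.2.1 / Scala 2.12. Artifact coordinates assumed from Sedona's
# published Maven artifacts; double-check them before relying on this.
packages = ",".join([
    # sedona-spark-shaded-3.0_2.12 targets Spark 3.0-3.3, matching PySpark 3.2.1
    "org.apache.sedona:sedona-spark-shaded-3.0_2.12:1.5.1",
    # geotools-wrapper release paired with Sedona 1.5.1
    "org.datasyslab:geotools-wrapper:1.5.1-28.2",
])
# Deliberately no sedona-python-adapter entry: that artifact stopped at the
# 1.4.x line, and loading it next to 1.5.1 jars mixes incompatible JTS classes.

# Usage (requires the sedona Python package and a Spark install):
#   from sedona.spark import SedonaContext
#   spark = (SedonaContext.builder()
#            .config("spark.jars.packages", packages)
#            .getOrCreate())
#   sedona = SedonaContext.create(spark)
```

The key idea is simply that every Sedona jar on the classpath comes from the same release line, chosen for the Spark minor version actually running.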
