I've generally used immutable value types when writing Java code. Sometimes it's been through libraries (Immutables, AutoValue, Lombok), but mostly just vanilla Java classes with:
- all final fields
- a constructor with all fields as parameters
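For example, a minimal sketch of the kind of class I mean (the name and fields are made up for illustration; equals/hashCode omitted for brevity):

```java
// A plain immutable value type: final fields, all-args constructor, getters only.
public final class Trip {
    private final String origin;
    private final String destination;
    private final int distanceKm;

    public Trip(String origin, String destination, int distanceKm) {
        this.origin = origin;
        this.destination = destination;
        this.distanceKm = distanceKm;
    }

    public String getOrigin() { return origin; }
    public String getDestination() { return destination; }
    public int getDistanceKm() { return distanceKm; }
}
```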
(This question is for Java 11 and below, given current Spark support.)
In Spark SQL, data types require an Encoder. When using an off-the-shelf encoder such as Encoders.bean(MyType.class), such an immutable data type results in an "illegal reflective access operation" warning.
I'm curious what the Spark SQL (Dataset) approach is here. Obviously I could relax this and make it a mutable POJO.
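Concretely, the sort of usage that produced the warning looks roughly like this (Trip is the illustrative class above; the rest is standard Spark setup):

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoder;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SparkSession;

public class TripDatasetExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("immutable-bean-example")
                .master("local[*]")
                .getOrCreate();

        // Off-the-shelf bean encoder for the (immutable) value type.
        Encoder<Trip> tripEncoder = Encoders.bean(Trip.class);

        List<Trip> trips = Arrays.asList(
                new Trip("AMS", "LHR", 371),
                new Trip("LHR", "JFK", 5540));

        // Creating and showing the Dataset is where the encoder/reflection machinery runs.
        Dataset<Trip> ds = spark.createDataset(trips, tripEncoder);
        ds.show();

        spark.stop();
    }
}
```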
Update
Looking into the code for Encoders.bean, it really does have to be a classic, mutable POJO: the reflection code looks for appropriate setters. Further (and this is documented), the only supported collection types are array, list, and map (not set).
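To make that concrete, the shape the bean encoder's reflection expects is the classic JavaBean, roughly (an illustrative sketch, not my actual class):

```java
// The classic mutable POJO shape that Encoders.bean expects:
// a public no-arg constructor and a getter/setter pair per field.
public class TripBean {
    private String origin;
    private String destination;
    private int distanceKm;

    public TripBean() { }

    public String getOrigin() { return origin; }
    public void setOrigin(String origin) { this.origin = origin; }

    public String getDestination() { return destination; }
    public void setDestination(String destination) { this.destination = destination; }

    public int getDistanceKm() { return distanceKm; }
    public void setDistanceKm(int distanceKm) { this.distanceKm = distanceKm; }
}
```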
This was actually a misdiagnosis: the immutability of my data type was not causing the reflective access issues. It was a JVM 11+ issue (mostly noted here: https://github.com/renaissance-benchmarks/renaissance/issues/241).
By adding the following JVM arguments, everything works correctly:
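The flags in question are the usual JDK 11+ --add-opens options that open the JDK-internal packages Spark reflects into; the list below is illustrative only, and the exact set should be matched to whichever packages your own run warns about:

```
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang.invoke=ALL-UNNAMED
--add-opens=java.base/java.nio=ALL-UNNAMED
--add-opens=java.base/java.util=ALL-UNNAMED
--add-opens=java.base/sun.nio.ch=ALL-UNNAMED
```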