Error: "An error occurred while calling o132.load. : java.lang.NoClassDefFoundError: org/apache/hadoop/fs/StreamCapabilities"
Spark version: 3.3.1
Hadoop version: 3.1.1.3.1.5.0-152
Spark command: spark-submit --master yarn --deploy-mode client --jars /spark3.3.1/jars/iceberg-spark-runtime-3.3_2.12-1.3.0.jar,/spark3.3.1/jars/hadoop-aws-3.1.1.jar,/app/spark3.3.1/jars/aws-java-sdk-bundle-1.11.271.jar
I am also setting this in the Spark session: ("spark.jars.packages", "org.apache.hadoop:hadoop-aws:3.1.1")
Code:
filesystem = spark._jvm.org.apache.hadoop.fs.FileSystem
path = spark._jvm.org.apache.hadoop.fs.Path
fs = filesystem.get(spark._jsc.hadoopConfiguration())
spark.conf.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
spark.conf.set("fs.s3a.access.key", auth.get('AWS_ACCESS_KEY_ID'))
spark.conf.set("fs.s3a.secret.key", auth.get('AWS_SECRET_ACCESS_KEY'))
df = (
    spark.readStream.schema(new_schema)
    .format(file_type)
    .option("header", file_header)
    .option("maxFilesPerTrigger", 1)
    .option("maxFilesPerBatch", 1)
    .option("timestampFormat", timestamp_format)
    .load(my_path)
)
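
One check that may help narrow this down is sketched below. It assumes, without confirmation from anything above, that the NoClassDefFoundError comes from hadoop-aws not matching the hadoop-common version that the Spark JVM actually loads. The helper names `hadoop_jar_versions`, `base_version` and `mismatched_jars` are hypothetical. The sketch also assumes that a vendor version string such as 3.1.1.3.1.5.0-152 starts with the Apache Hadoop version (3.1.1).

```python
import re

# Matches file names like "hadoop-aws-3.1.1.jar"; captures artifact and version.
_HADOOP_JAR = re.compile(
    r"(hadoop-[a-z0-9-]+?)-(\d+(?:\.\d+)*(?:[.-][\w.-]+)?)\.jar$"
)


def hadoop_jar_versions(jars_arg):
    """Map each hadoop-* artifact in a comma-separated --jars value to its version."""
    versions = {}
    for entry in jars_arg.split(","):
        file_name = entry.strip().rsplit("/", 1)[-1]
        match = _HADOOP_JAR.match(file_name)
        if match:
            versions[match.group(1)] = match.group(2)
    return versions


def base_version(version):
    """Keep the first three dot-separated components (assumed Apache version)."""
    return ".".join(version.split(".")[:3])


def mismatched_jars(versions, runtime_version):
    """List artifacts whose base version differs from the runtime Hadoop version."""
    expected = base_version(runtime_version)
    return sorted(
        name for name, ver in versions.items() if base_version(ver) != expected
    )


# In a live session, the runtime version could be read with:
#   spark._jvm.org.apache.hadoop.util.VersionInfo.getVersion()
# The value below is only a placeholder taken from the cluster version above.
jars = "/spark3.3.1/jars/hadoop-aws-3.1.1.jar,/app/spark3.3.1/jars/aws-java-sdk-bundle-1.11.271.jar"
print(mismatched_jars(hadoop_jar_versions(jars), "3.1.1.3.1.5.0-152"))
```

If the version reported by the Spark JVM differs from the cluster's Hadoop version, comparing against that reported value instead would show whether the hadoop-aws jar is the odd one out.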