I'm using a Databricks notebook to process some large files; example code from the notebook is below. This code ran perfectly when my cluster's access mode was not Shared, but now that we are moving to Shared access mode it fails with an error. The runtime version is 13.3 LTS with Spark 3.4.1.
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

data = [("xx", "cc", "sdf"), ("abcd", "xx", "jkhj")]
cols = ["value", "x1", "x2"]
df = spark.createDataFrame(data, schema="value STRING, x1 STRING, x2 STRING")

# xxx is our actual function (name redacted); it calls into a compiled native extension
MY_udf = udf(xxx, StringType())
updated_df = df.withColumn("updated_value", MY_udf(df["value"]))
display(updated_df)
The error I get is: SparkRuntimeException: [UDF_ERROR.PAYLOAD] Execution of function failed XXX - failed to set payload
INVALID_ARGUMENT: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by /local_disk0/.ephemeral_nfs/envs/pythonEnv-1860d5f1-5b01-457c-97c7-e5955a17ca8e/lib/python3.10/site-packages/xxx.cpython-310-x86_64-linux-gnu.so)
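The error suggests the package's compiled extension (`xxx.cpython-310-x86_64-linux-gnu.so`) was built against glibc 2.33, while the environment executing the UDF under Shared access mode provides an older glibc. A minimal way to check which glibc version the Python process actually sees (assuming a Linux host; run this in a notebook cell on the cluster) is:

```python
import platform

# libc_ver() reports the libc the running interpreter is linked against;
# under Shared access mode the UDF may execute in a sandboxed environment
# whose glibc is older than the one on the driver
name, version = platform.libc_ver()
print(name, version)
```

If this prints a glibc version below 2.33 in the environment where the UDF runs, that would explain the `GLIBC_2.33' not found` failure, and a build of the package targeting an older glibc (e.g. a manylinux wheel) may be needed.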