Are there conventional UDFs in Spark 3.5.0 that don't use Arrow?


I have used Pandas UDFs and I am aware that they use PyArrow to avoid object serialization between the JVM and the Python interpreter running on the workers. I am also aware of the property spark.sql.execution.arrow.enabled, which can be set to true to optimize conversions between pandas DataFrames and Spark DataFrames. My questions are:

  • Are there still conventional (row-at-a-time) UDFs in PySpark 3.5.0 that are not affected by PyArrow?
  • Is it necessary to set spark.sql.execution.arrow.enabled to true in order to use Pandas UDFs and avoid converting JVM objects <-> Python objects?
  • When the property is enabled, does that mean there is no conversion between Java and Python even with conventional UDFs? Or does the improvement apply only to Pandas UDFs specifically, and not to general UDFs? (This is basically a rewording of my first question.)