How to convert pyspark df to python string?


I want to show one way to convert a PySpark DataFrame to a Python string. Task: get a Python string from a PySpark DataFrame.

If you know how to make this easier, let me know!

  1. I got a DataFrame from spark.sql():
sql = f"""
SELECT max(calculation_dt) max_calc FROM default.table
"""

max_calc_dt = spark.sql(sql)

It returned only one row:

+----------+
|  max_calc|
+----------+
|2023-07-31|
+----------+

  2. Cast the max_calc column from date to string (note the required import):
from pyspark.sql.functions import col

a = max_calc_dt.select(col('max_calc').cast('string'))  # cast max_calc from date to string

  3. Collect the rows via the RDD (a.rdd.collect() returns a plain Python list of Row objects):
a_rdd = a.rdd.collect()
  4. Get the Python str. There are two ways: iterate with a for loop, or access the value by index and column name. I'll show the second option:
res = a_rdd[0]['max_calc'].strip()
print(res)  # 2023-07-31
