How do I get the string representation of a UUID column?


I have a Cassandra table that contains a UUID field. Creating a Spark DataFrame from it (PySpark 3.4.0) shows the field as {__class__=uuid.UUID, int=809582560205543685759249226656473694}, or something like that. Any idea how to get the string representation of that value?

If a UDF is necessary, working syntax for one would be appreciated. This is what I tried:

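import uuid

from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

def udf_for_uuid(input_val):
    # Try to build a UUID from the incoming value; return None on a ValueError.
    try:
        return uuid.UUID(input_val)
    except ValueError:
        return None

# The UDF is declared with a StringType return type.
uid_udf = udf(udf_for_uuid, StringType())
df = spark.createDataFrame(data, schema)
df = df.withColumn('msg_id_string', uid_udf(df['msg_id']))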

msg_id (the original column) shows {__class__=uuid.UUID, int=809582560205543685759249226656473694}. msg_id_string (the added column) shows null.

In Cassandra, msg_id is 007550ad-802f-11ed-a92a-0f3d2bcd625e.
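
One possible direction, offered as a sketch rather than a confirmed fix: the attempt above passes the value to uuid.UUID() (which goes from a string to a UUID object) and returns that object from a UDF declared as StringType, whereas the goal is the opposite conversion. The helper below, with the hypothetical name uuid_to_str, goes from a UUID-like value to its canonical string. It assumes the value reaches Python as a uuid.UUID, a plain int, a hex string, or a mapping with an "int" key (a guess based on the {__class__=uuid.UUID, int=...} display); how the Cassandra connector actually delivers the value has not been verified here.

```python
import uuid
from collections.abc import Mapping


def uuid_to_str(value):
    """Return the canonical hyphenated string for a UUID-like value, or None.

    Hypothetical helper: the accepted input shapes are assumptions, not
    something confirmed for the Cassandra connector.
    """
    if value is None:
        return None
    if isinstance(value, uuid.UUID):
        return str(value)
    if isinstance(value, Mapping):
        # Assumed shape: {"__class__": ..., "int": <128-bit integer>}
        value = value.get("int")
    try:
        if isinstance(value, int):
            return str(uuid.UUID(int=value))
        # Fall back to parsing a hex string, with or without hyphens.
        return str(uuid.UUID(str(value)))
    except (ValueError, TypeError):
        return None


# Round trip using the msg_id value quoted in the question.
example = uuid.UUID("007550ad-802f-11ed-a92a-0f3d2bcd625e")
print(uuid_to_str(example))
print(uuid_to_str({"int": example.int}))
```

If this matches how the column arrives, the helper could be wrapped as udf(uuid_to_str, StringType()) and applied with df.withColumn('msg_id_string', ...), keeping in mind that withColumn returns a new DataFrame rather than modifying df in place.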
