How to read and execute an .hql file (Hive query) and create a PySpark DataFrame

I have a .hql file. I need to read it, execute the query, and create a DataFrame from the query result. This is my current code:

def read_and_exec_hql(hql_file_path):
    with open(hql_file_path, 'r') as f:
        hql_query = f.read().strip()
    queries = [q.strip() for q in hql_query.splitlines() if q.strip() and not q.startswith('--')]
    df = None
    for query in queries:
        if df is None:
            df = spark.sql(query)
        else:
            df = df.union(spark.sql(query))
    return df
    
hql_file_path = 'path/to/hql/file'
df = read_and_exec_hql(hql_file_path)
df.show()

Running this raises a Py4JJavaError.

Is there another way to read and execute .hql files in PySpark? Please let me know. Thanks in advance.
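
One possible approach, as a minimal sketch rather than a definitive fix: split the file on `;` instead of on line breaks, so that a statement spanning several lines reaches `spark.sql` in one piece, and run the statements one at a time. The helper names `split_hql_statements` and `run_hql_file` are hypothetical. The sketch assumes that `;` only ever terminates a statement (never appears inside a string literal), that comments occupy whole lines starting with `--`, and that only the result of the last statement is wanted as the DataFrame.

```python
from pathlib import Path


def split_hql_statements(hql_text):
    """Split HQL text into individual statements (hypothetical helper).

    Assumes ';' only ever terminates a statement and that comments
    occupy whole lines starting with '--'.
    """
    # Drop blank lines and full-line comments, checking the stripped line
    # so that indented comments are caught as well.
    kept = [
        line for line in hql_text.splitlines()
        if line.strip() and not line.strip().startswith("--")
    ]
    # Rejoin so multi-line statements stay intact, then split on ';'.
    statements = "\n".join(kept).split(";")
    return [s.strip() for s in statements if s.strip()]


def run_hql_file(spark, hql_file_path):
    """Run each statement in order and return the last statement's result."""
    statements = split_hql_statements(Path(hql_file_path).read_text())
    df = None
    for statement in statements:
        # Pass one statement per call, without the trailing ';'.
        df = spark.sql(statement)
    return df
```

Usage would look like `df = run_hql_file(spark, 'path/to/hql/file')` followed by `df.show()`, where `spark` is an existing SparkSession (built with `enableHiveSupport()` if the queries read Hive tables). If every statement in the file is a SELECT with the same schema and the results really should be combined, the loop could collect the DataFrames and union them instead of keeping only the last one.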

There are 0 best solutions below