sql command containing CTE executes fine %sql type of command however fails with spark.sql in Databricks

32 Views Asked by At

I have a sql command in Databricks, that executes fine with %sql but when I use spark.sql so that python code can be utilized to populate the table names it returns following error.

The spark driver has stopped unexpectedly and is restarting. Your notebook will be automatically reattached.

I am using CTE expression in databricks.

CTE is as follows:

with
    a_CTE as (
        SELECT DISTINCT
            colA, 'abc' ColB
        from " + tableA + "
    ),
    b_CTE (colC) as (
        select distinct
            ColC
        from " + tableB + "
    )
select *
from a_CTE , b_CTE 

Unfortunately, I will have to go back to %sql and hard code the table names there. I could not find other alternative.

1

There are 1 best solutions below

0
rainingdistros On

Ideally it should work... Since code was not provided I assume it must have been some typo issue...

I tried below query and it seemed to work.

qry = ("with "
       f"a_CTE as (SELECT DISTINCT ColA, 'abc' ColB from {tableA}),"
       f"b_CTE (colC) as (select distinct ColC from {tableB})"
       f"select * from a_CTE , b_CTE")

Basically just converted the table name variables as an f-string implementation.

Also please consider that within sql blocks, after DBX runtime 14.1, you have the option to use variables. Widgets is another option.

Please try and let us know. Please provide code and error message in case this does not help...

Cheers,