I've wrote some experimental code that dynamically unpivots that table in Snowflake by re-casting all columns to VARCHAR and creating a list of columns using the LISTAGG function. The procedure works; however, I would like to ask if it's possible to improve the syntax by:
- Reducing the dependency on string manipulation inside the table
- Is it possible to replace the ugly
REGEXP_REPLACEcall with a more elegant loop-based solution that would recast all required columns?
/* Dynamically Unpivot Table
*
* The following procedure can be used to dynamically unpivot table without specifying column names.
* The code, will re-cast the columns into the varchar type and return the table with column name and
* column value.
*/
-- Procedure definition
CREATE OR REPLACE PROCEDURE dynamic_unpivot(table_name TEXT)
RETURNS TABLE(col_name STRING, col_val STRING)
LANGUAGE SQL
COMMENT = 'Dynamically unpivot table'
AS
DECLARE
res RESULTSET;
pivot_cols TEXT;
pivot_cols_recast TEXT;
recast_select_statement TEXT;
select_statement TEXT;
BEGIN
-- Capture columns to pivot through
SHOW COLUMNS IN VIEW IDENTIFIER(:table_name);
pivot_cols := (SELECT LISTAGG(DISTINCT "column_name", ', ') FROM TABLE(RESULT_SCAN(LAST_QUERY_ID())));
pivot_cols_recast := (REGEXP_REPLACE(pivot_cols, '(\\b\\w+\\b)', '\\1::VARCHAR AS \\1'));
recast_select_statement := 'SELECT ' || :pivot_cols_recast || ' FROM ' || :table_name ;
select_statement := 'SELECT * FROM (' || :recast_select_statement || ' ) t_recast' ||
' UNPIVOT (col_val ' || ' FOR col_name ' || ' IN (' || :pivot_cols || '))';
res := (EXECUTE IMMEDIATE :select_statement);
RETURN TABLE(res);
END;
-- Tests
-- Create temporary table
CREATE OR REPLACE TEMPORARY TABLE tmp_t1 (col_a VARCHAR, col_b NUMBER);
-- Insert some sample data
INSERT INTO tmp_t1 (col_a, col_b) VALUES
('one', 1),
('two', 2),
('three', 3),
('four', 4);
-- Test query running procedure
CALL dynamic_unpivot('tmp_t1');
SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID())) ORDER by col_name;
-- Cleaning
DROP PROCEDURE IF EXISTS dynamic_unpivot(TEXT);
DROP TABLE IF EXISTS tmp_t1;
Instead of using unpivot for this, you can use snowflake semi-structured data handling.
We can use
object_constuct_keep_null(*)to gather all the columns in each row of the table into object representing the row. Then uselateral flattento perform the 'UNPIVOT' operation. In my experience this is often performs faster than UNPIVOT as well but YMMV.The SQL looks like this:
If you still want this enclosed within a Stored-Procedure call it would look like this:
An alternative approach to keep the tablename dynamic, whilst returning a tabular output that you can use (i.e. instead of using
TABLE(RESULT_SCAN(LAST_QUERY_ID())), is to use a session variable andidentifier()to enclose it in the query body.You can even turn this into a 'dynamic' view that you can query from, provided you have set the session variable to something before the view creation.
With this you can set table_name to any view/table that you want to unpivot and get the desired result. e.g.
NOTE: If you try and query the view without the table_name variable set you will get an error.
Session variable '$TABLE_NAME' does not exist