Snowflake allows to concatenate string across multiple rows by using aggregate/windowed LISTAGG function. Sometimes we just need first few occurrences per group. Similar to GROUP_CONCAT with LIMIT:
SELECT grp, COUNT(*) AS cnt,
GROUP_CONCAT(col ORDER BY col SEPARATOR ',' LIMIT 3) AS first_3
FROM tab
GROUP BY grp;
Example:
CREATE OR REPLACE TABLE tab(grp TEXT, col TEXT) AS
SELECT * FROM VALUES
('Grp1', 'A'),('Grp1', 'B'),('Grp1', 'C'),('Grp1', 'D'), ('Grp1', 'E'),
('Grp2', 'X'),('Grp2', 'Y'),('Grp2', 'Z'),('Grp2', 'V'),
('Grp3', 'M'),('Grp3', 'N'),('Grp3', 'O');
For provided sample input the expected result is:
| grp | cnt | first_3 |
|---|---|---|
| Grp1 | 5 | A,B,C |
| Grp2 | 4 | X,Y,Z |
| Grp3 | 3 | M,N,0 |
The idea is to find simple way of controlling number of rows considered for LISTAGG without affecting other aggregate functions and using subqueries like:
SELECT grp,
COUNT(*) AS cnt,
LISTAGG(CASE WHEN rn <= 3 THEN col END, ',') WITHIN GROUP(ORDER BY col) AS first_3
FROM (SELECT *, ROW_NUMBER() OVER(PARTITION BY grp ORDER BY col) AS rn
FROM tab)
GROUP BY grp;
The alternative to aggregate string concatenation is creation of array and transforming it to string afterwards. MIN_BY/MAX_BY allows to choose number of rows for returned array.
Output:
Here:
ARRAY_TO_STRINGandMIN_BYis effectively an equivalent ofLISTAGGwithLIMIT.It also works well for windowed scenario, if all rows are required:
If
LISTAGG(DISTINCT ...)is expected thenARRAY_DISTINCTcan be added: