How to join 3 tables with group_concat


I am using CubeCart for a customer's website, and they have used comma-separated values to associate certain option IDs with products. I need to create a custom table that extracts all this information into a format where I can assign different product codes to varying combinations of product options.

Products Table

Product ID  Product Code  Assign Key
1           ABC           23,45
1           HIJ           23
1           KLM           45
2           DEF           10,28
2           GHI           10
2           NOP           28

Assign Table

Product ID  Assign ID  Value ID
1           23         1
1           45         2
2           10         3
2           28         4

Values Table

Value ID  Value
1         Red
2         Large
3         Blue
4         Small

I can work out how I would connect the products table directly to the values table if the assign keys were actually value IDs, but I can't work out how to do it with the assign table in the middle: I need to connect products to assign, and assign to values.

    SELECT 
        t1.product_code, 
        t1.product_id,
        t1.assign_key,
        GROUP_CONCAT(t2.value_name)
    FROM products t1 
    LEFT JOIN values t2 ON FIND_IN_SET(t2.value_id, t1.assign_key)    
    GROUP BY t1.assign_key

Expected output:

Product ID  Product Code  Assign IDs  Value IDs  Values
1           ABC           23,45       1,2        Red, Large
1           HIJ           23          1          Red
1           KLM           45          2          Large
2           DEF           10,28       3,4        Blue, Small
2           GHI           10          3          Blue
2           NOP           28          4          Small

I tried adding another join in the middle, but cannot work out how to GROUP_CONCAT twice.

I cannot just separate the assign keys so that there is one value per row, because the whole point is that the product code is only relevant to the combination of BOTH assign keys.
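In procedural terms, the lookup being asked for can be sketched like this (plain Python, with the sample tables as literal dicts; the data and names are illustrative, taken from the tables above):

```python
# Each product row keeps its full assign_key; we map every assign ID in the
# key through the assign table to a value ID, then through the values table.
products = [
    (1, "ABC", "23,45"), (1, "HIJ", "23"), (1, "KLM", "45"),
    (2, "DEF", "10,28"), (2, "GHI", "10"), (2, "NOP", "28"),
]
assign = {(1, 23): 1, (1, 45): 2, (2, 10): 3, (2, 28): 4}  # (product_id, assign_id) -> value_id
values = {1: "Red", 2: "Large", 3: "Blue", 4: "Small"}     # value_id -> value

rows = []
for pid, code, key in products:
    # Split the CSV key and resolve each assign ID, preserving its position.
    value_ids = [assign[(pid, int(a))] for a in key.split(",")]
    rows.append((pid, code, key,
                 ",".join(str(v) for v in value_ids),
                 ",".join(values[v] for v in value_ids)))

print(rows[0])  # (1, 'ABC', '23,45', '1,2', 'Red,Large')
```

Note that each output row keeps the whole combination of assign keys together, which is the constraint described above; the question is how to express this in SQL.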

2 Answers

Luuk (accepted answer)

This query produced the desired results:

SELECT 
  p.product_id,
  p.product_code,
  GROUP_CONCAT( SUBSTRING_INDEX(SUBSTRING_INDEX(assign_key,',',FIND_IN_SET(a.assign_id,p.assign_key)),',',-1)
          ORDER BY SUBSTRING_INDEX(SUBSTRING_INDEX(assign_key,',',FIND_IN_SET(a.assign_id,p.assign_key)),',',-1)) as AssignIDs,
  GROUP_CONCAT( a.value_id ORDER BY a.value_id) as ValueIDs,
  GROUP_CONCAT( v.value_name ORDER BY a.value_id) as "Values"
FROM products p
INNER JOIN assign a ON a.product_id = p.product_id
INNER JOIN value v ON v.value_id = a.value_id
WHERE FIND_IN_SET(a.assign_id,p.assign_key) > 0
GROUP BY p.product_id, p.product_code;

output:

product_id  product_code  AssignIDs  ValueIDs  Values
1           ABC           23,45      1,2       Red,Large
1           HIJ           23         1         Red
1           KLM           45         2         Large
2           DEF           10,28      3,4       Blue,Small
2           GHI           10         3         Blue
2           NOP           28         4         Small

A short step-by-step plan is shown in the DBFIDDLE; it is basically "keep adding columns to the query until all desired values are present", then apply GROUP_CONCAT with a proper GROUP BY.

(An explanation about the error "ONLY_FULL_GROUP_BY", and how to solve it in your query, is in the DBFIDDLE too.)

Some notes:

  • SUBSTRING_INDEX(SUBSTRING_INDEX(assign_key,',',FIND_IN_SET(a.assign_id,p.assign_key)),',',-1) extracts the nth element of a comma-separated string.

    A short example; this returns d:

  • The ordering within "Values" is done on ValueID so that each description lines up with its ID: we get "Red,Large" (and not an alphabetical ordering of these values) because the ValueIDs are "1,2" and Red is value 1.
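The nth-element trick above can also be sketched outside SQL; here is a small Python equivalent of the nested SUBSTRING_INDEX calls (the function names are mine, not MySQL's):

```python
def substring_index(s, delim, count):
    """Mimic MySQL SUBSTRING_INDEX: everything before the count-th
    occurrence of delim (count > 0), or everything after the count-th
    occurrence from the right (count < 0)."""
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])
    return delim.join(parts[count:])

def nth_element(csv, n):
    # Take the first n elements, then the last one of those.
    return substring_index(substring_index(csv, ",", n), ",", -1)

print(nth_element("a,b,c,d,e", 4))  # d
```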

GMB

As an alternative to FIND_IN_SET, we could use JSON. A comma-separated list of integers can easily be turned into a JSON array by surrounding it with square brackets; then we can use the handy JSON function JSON_TABLE to unnest the array to rows, while keeping track of the original position of each element with a FOR ORDINALITY column.

With the unnested list of keys at hand, it is now easy to bring in tables assign and value with joins; and since we have the original index available, it is safe and simple to generate the CSV list of values.

We could express this in a lateral join:

select p.*, x.*
from products p
cross join lateral (
    select 
        group_concat(a.value_id   order by j.idx) value_ids,
        group_concat(v.value_name order by j.idx) value_names
    from json_table(
        concat('[', p.assign_key, ']'),
        '$[*]' columns ( idx for ordinality, assign_id int path '$')
    ) j
    inner join assign a on a.assign_id = j.assign_id
    inner join value v on v.value_id  = a.value_id
) x

Or we could unnest and aggregate directly in the outer query:

select p.product_id, p.product_code, p.assign_key,
  group_concat(a.value_id   order by j.idx) value_ids,
  group_concat(v.value_name order by j.idx) value_names
from products p
cross join json_table(
    concat('[', p.assign_key, ']'),
    '$[*]' columns ( idx for ordinality, assign_id int path '$')
) j
inner join assign a on a.assign_id = j.assign_id
inner join value v on v.value_id  = a.value_id
group by p.product_id, p.product_code, p.assign_key

One upside of this approach is that it correctly handles duplicates in the CSV list, if any (whereas FIND_IN_SET-based solutions would usually drop one of the duplicates). I added that case to product 1 in your sample data.

Also, it avoids repeated calls to FIND_IN_SET; but of course JSON_TABLE comes with a price too, so whether this improves performance or not depends highly on your data and setup.
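What the JSON unnesting does to a key like 23,45,23 can be sketched in Python: wrap the CSV in brackets, parse it as a JSON array, and keep each element's 1-based position, just as FOR ORDINALITY does (the assign/values dicts below are illustrative, restricted to product 1):

```python
import json

assign_key = "23,45,23"
# Turn '23,45,23' into the JSON array [23,45,23] and unnest it to
# (ordinality, assign_id) rows; duplicates survive, positions are kept.
rows = [(idx + 1, assign_id)
        for idx, assign_id in enumerate(json.loads("[" + assign_key + "]"))]
print(rows)  # [(1, 23), (2, 45), (3, 23)]

# Re-aggregating ordered by the ordinality column reproduces the CSV order.
assign = {23: 1, 45: 2}          # assign_id -> value_id
values = {1: "Red", 2: "Large"}  # value_id -> value_name
value_names = ",".join(values[assign[a]] for _, a in rows)
print(value_names)  # Red,Large,Red
```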

Demo on DBFiddle

Product_ID  Product_Code  Assign_Key  value_ids  value_names
1           ABC           23,45,23    1,2,1      Red,Large,Red
1           HIJ           23          1          Red
1           KLM           45          2          Large
2           DEF           10,28       3,4        Blue,Small
2           GHI           10          3          Blue
2           NOP           28          4          Small