I have an annotated Queryset in Django. It has about a hundred annotated fields. I don't know why, when the query is executed I get the error

django.db.utils.OperationalError: (1117, 'Too many columns')

The db is MySQL. The query (generated by Django) is about 50.000 characters long, but in summary it looks like:

SELECT DISTINCT

(((EXTRACT(YEAR FROM MIN(`table2`.`a`))*12)+EXTRACT(MONTH FROM MIN(`table2`.`a`)))-1) AS `month_a`,
(((EXTRACT(YEAR FROM MIN(`table2`.`b`))*12)+EXTRACT(MONTH FROM MIN(`table2`.`a`)))-1) AS `month_b`,
/* ... */
(((EXTRACT(YEAR FROM MIN(`table2`.`h`))*12)+EXTRACT(MONTH FROM MIN(`table2`.`a`)))-1) AS `month_h`,

/* a vast amount of other things (XXXX AS xx,). In this amount of things, each of the previous "((((EXTRACT...)" expressions is repeated about 50 times in total: we can't just reuse `month_x` as it makes use of grouping */

FROM `table1`
LEFT OUTER JOIN `table2` ON (`table1`.`id` = `table_2`.`ext_id`) /* this is needed again for other things */
/* about 15 other LEFT OUTER JOIN */

GROUP BY `table1`.`id` /* other groupings */
HAVING /* some clauses */
ORDER BY /* some criteria */

I spent two days trying to understand why this happens, with scarce results. Instead, in a moment of desperation I tried this:

SELECT DISTINCT
/* the same vast amount of other things but this time I can reuse the `month_x` as values */

FROM `table1`
LEFT OUTER JOIN
(SELECT
    (((EXTRACT(YEAR FROM MIN(`table2`.`a`))*12)+EXTRACT(MONTH FROM MIN(`table2`.`a`)))-1) AS `month_a`,
    (((EXTRACT(YEAR FROM MIN(`table2`.`b`))*12)+EXTRACT(MONTH FROM MIN(`table2`.`a`)))-1) AS `month_b`,
    /* ... */
    (((EXTRACT(YEAR FROM MIN(`table2`.`h`))*12)+EXTRACT(MONTH FROM MIN(`table2`.`a`)))-1) AS `month_h`,
    FROM `table1`
    LEFT OUTER JOIN `table2` ON (`table1`.`id` = `table_2`.`ext_id`)
) TA1 ON (`table1`.`id` = TA1.`ann_id`)
LEFT OUTER JOIN `table2` ON (`table1`.`id` = `table_2`.`ext_id`) /* it is needed again for other things */
/* about 15 othe LEFT OUTER JOIN (as before) */

GROUP BY `table1`.`id` /* other groupings (as before) */
HAVING /* some clauses (as before) */
ORDER BY /* some criteria (as before) */

It ran correctly and fast! From what I have read, this forces the evaluation of the month_xs in a subquery. Somewhere this kind of things is said to be inefficient, but the execution seems pretty fast to me and, as of now, is the only way I found to make it work.

My questions are:

  • what is the cause of the first error? What does it have to do with the number of columns?
  • Is it possible to write the equivalent of the second query in the Django ORM? Subqueries seem to be only capable of resuting in one column, I need more (8 in the example, 20 in reality). Or I could make 20 subqueries, all doing very similar oprations, and retrieve just one column from each, but this solution looks a bit consuming... isn't it? I would also accept suggestions on how to make the query work in a different way: the only things I want to avoid are raw queries, extras() and so on...

Something very similar has been asked two years ago in link but received no answer... Plus, I also ask for what is the connection between this and the "Toom many columns." error.

Thank you for reading this far.

0

There are 0 best solutions below