I have an annotated Queryset in Django. It has about a hundred annotated fields. I don't know why, when the query is executed I get the error
django.db.utils.OperationalError: (1117, 'Too many columns')
The db is MySQL. The query (generated by Django) is about 50.000 characters long, but in summary it looks like:
SELECT DISTINCT
(((EXTRACT(YEAR FROM MIN(`table2`.`a`))*12)+EXTRACT(MONTH FROM MIN(`table2`.`a`)))-1) AS `month_a`,
(((EXTRACT(YEAR FROM MIN(`table2`.`b`))*12)+EXTRACT(MONTH FROM MIN(`table2`.`a`)))-1) AS `month_b`,
/* ... */
(((EXTRACT(YEAR FROM MIN(`table2`.`h`))*12)+EXTRACT(MONTH FROM MIN(`table2`.`a`)))-1) AS `month_h`,
/* a vast amount of other things (XXXX AS xx,). In this amount of things, each of the previous "((((EXTRACT...)" expressions is repeated about 50 times in total: we can't just reuse `month_x` as it makes use of grouping */
FROM `table1`
LEFT OUTER JOIN `table2` ON (`table1`.`id` = `table_2`.`ext_id`) /* this is needed again for other things */
/* about 15 other LEFT OUTER JOIN */
GROUP BY `table1`.`id` /* other groupings */
HAVING /* some clauses */
ORDER BY /* some criteria */
I spent two days trying to understand why this happens, with scarce results. Instead, in a moment of desperation I tried this:
SELECT DISTINCT
/* the same vast amount of other things but this time I can reuse the `month_x` as values */
FROM `table1`
LEFT OUTER JOIN
(SELECT
(((EXTRACT(YEAR FROM MIN(`table2`.`a`))*12)+EXTRACT(MONTH FROM MIN(`table2`.`a`)))-1) AS `month_a`,
(((EXTRACT(YEAR FROM MIN(`table2`.`b`))*12)+EXTRACT(MONTH FROM MIN(`table2`.`a`)))-1) AS `month_b`,
/* ... */
(((EXTRACT(YEAR FROM MIN(`table2`.`h`))*12)+EXTRACT(MONTH FROM MIN(`table2`.`a`)))-1) AS `month_h`,
FROM `table1`
LEFT OUTER JOIN `table2` ON (`table1`.`id` = `table_2`.`ext_id`)
) TA1 ON (`table1`.`id` = TA1.`ann_id`)
LEFT OUTER JOIN `table2` ON (`table1`.`id` = `table_2`.`ext_id`) /* it is needed again for other things */
/* about 15 othe LEFT OUTER JOIN (as before) */
GROUP BY `table1`.`id` /* other groupings (as before) */
HAVING /* some clauses (as before) */
ORDER BY /* some criteria (as before) */
It ran correctly and fast!
From what I have read, this forces the evaluation of the month_xs in a subquery. Somewhere this kind of things is said to be inefficient, but the execution seems pretty fast to me and, as of now, is the only way I found to make it work.
My questions are:
- what is the cause of the first error? What does it have to do with the number of columns?
- Is it possible to write the equivalent of the second query in the Django ORM? Subqueries seem to be only capable of resuting in one column, I need more (8 in the example, 20 in reality). Or I could make 20 subqueries, all doing very similar oprations, and retrieve just one column from each, but this solution looks a bit consuming... isn't it? I would also accept suggestions on how to make the query work in a different way: the only things I want to avoid are raw queries, extras() and so on...
Something very similar has been asked two years ago in link but received no answer... Plus, I also ask for what is the connection between this and the "Toom many columns." error.
Thank you for reading this far.