I've set up Postgres 9.6 and verified on a large table of random integers that parallel queries work. However, a simple XPath query on an XML column of another table always runs sequentially. Both XPath functions are marked as parallel safe in Postgres. I tried raising the cost of the xpath function so that the estimated query cost skyrocketed, but it didn't change anything. What am I missing?
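The cost change was roughly the following (the exact value is only illustrative):
ALTER FUNCTION xpath(text, xml, text[]) COST 100000;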
Example table DDL:
CREATE TABLE "test_table" ("xml" XML );
Example query:
SELECT xpath('/a', "xml") FROM "test_table";
Example data:
<a></a>
Note that the real data contains XML documents that are 10-1000 kB in size.
> select pg_size_pretty(pg_total_relation_size('test_table'));
28 MB
> explain (analyze, verbose, buffers) select xpath('/a', "xml") from test_table;
Seq Scan on public.test_table (cost=0.00..64042.60 rows=2560 width=32) (actual time=1.420..4527.061 rows=2560 loops=1)
Output: xpath('/a'::text, xml, '{}'::text[])
Buffers: shared hit=10588
Planning time: 0.058 ms
Execution time: 4529.503 ms
The relevant point here is likely the distinction between "relation size" and "total relation size":
Large column values like these are not stored within the main relation, but instead are pushed to its associated TOAST table. This external storage does not count towards pg_relation_size(), which is what the optimiser appears to be comparing against min_parallel_relation_size when evaluating a parallel plan:
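A quick way to see the gap is to compare both size functions on the example table:
> select pg_size_pretty(pg_relation_size('test_table')), pg_size_pretty(pg_total_relation_size('test_table'));
If pg_relation_size() comes out below min_parallel_relation_size (8 MB by default in 9.6), lowering that setting is one way to check whether the threshold is what blocks the parallel plan:
> set min_parallel_relation_size = 0;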