To what degree does PostgreSQL support parallel DDL?

1.5k Views Asked by At

Looking here, it is clear that Oracle supports execution of DDL commands in parallel with scenarios clearly listed. I was wondering whether Postgres does indeed offer such functionality? I can find a lot of material on "parallel queries" for PostgreSQL but not so much when DDL is involved.

For example, can I execute multiple 'CREATE TABLE...AS SELECT' in parallel? And if not, how can I achieve such functionality? What happens if I have a temporary table (CREATE TEMP TABLE)? Do I need to configure something for locks?

2

There are 2 best solutions below

2
Robert Harvey On

From here:

Even when it is in general possible for parallel query plans to be generated, the planner will not generate them for a given query if any of the following are true:

  • The query writes any data or locks any database rows. If a query contains a data-modifying operation either at the top level or within a CTE, no parallel plans for that query will be generated.

(emphasis mine).

Which seems to suggest that Postgres will not "parallelize" any query that modifies the database structure, under any circumstances.

Running multiple queries simultaneously in Postgres requires one connection per running query.

0
Panagiotis Kanavos On

Those are generic DDL statements, they are index operations and partition operations that can be parallelized.

If you check the Notes section of the CREATE INDEX statement, you'll see that parallel index building is supported :

PostgreSQL can build indexes while leveraging multiple CPUs in order to process the table rows faster. This feature is known as parallel index build. For index methods that support building indexes in parallel (currently, only B-tree), maintenance_work_mem specifies the maximum amount of memory that can be used by each index build operation as a whole, regardless of how many worker processes were started. Generally, a cost model automatically determines how many worker processes should be requested, if any.

Update

I suspect the real question is about CREATE TABLE ... AS though.

This is essentially a CREATE TABLE followed by an INSERT .. SELECT. The CREATE TABLE part can't be parallelized and doesn't have to - it's essentially a metadata operation. The SELECT on the other hand, could be parallelized easily. INSERT is a bit harder, but it's a matter of implementation.

As a_horse_with_no_name explains in a comment to this question, parallelization for CREATE TABLE AS was added in PostgreSQL 11 :

Improvements to parallelism, including:

  • CREATE INDEX can now use parallel processing while building a B-tree index
  • Parallelization is now possible in CREATE TABLE ... AS, CREATE MATERIALIZED VIEW, and certain queries using UNION
  • Parallelized hash joins and parallelized sequential scans now perform better