Re: Allow parallel plan for referential integrity checks?

From: Frédéric Yhuel <frederic(dot)yhuel(at)dalibo(dot)com>
To: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>
Cc: "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Ian Lawrence Barwick <barwick(at)gmail(dot)com>, "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Jacob Champion <jchampion(at)timescale(dot)com>, robertmhaas(at)gmail(dot)com
Subject: Re: Allow parallel plan for referential integrity checks?
Date: 2023-08-17 12:00:59
Message-ID: 37eb5e9a-a094-b8f0-852f-56ac6ed96fac@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 8/17/23 09:32, Frédéric Yhuel wrote:
>
>
> On 8/10/23 17:06, Juan José Santamaría Flecha wrote:
>> Recently I restored a database from a directory format backup and
>> having this feature would have been quite useful
>
> Hi,
>
> Thanks for resuming work on this patch. I forgot to mention this in my
> original email, but the motivation was also to speed up the restore
> process. Parallelizing the FK checks could make a huge difference in
> certain cases. We should probably provide such a test case (with perf
> numbers), and maybe this is it what Robert asked for.

I have attached two scripts which demonstrate the following problems:

1a. if the tables aren't analyzed nor vacuumed before the post-data
step, then they are index-only scanned, with a lot of heap fetches
(depending on their size, the planner sometimes chooses a seq scan instead).

1b. if the tables have been analyzed but not vacuumed before the
post-data-step, then they are scanned sequentially. Usually better, but
still not so good without a parallel plan.

2. if the visibility maps have been created, then the tables are
index-only scanned without heap fetches, but this can still be slower
than a parallel seq scan.

So it would be nice if pg_restore could vacuum analyze the tables before
the post-data step. I believe it would be faster in most cases.

And it would be nice to allow a parallel plan for RI checks.

Best regards,
Frédéric

Attachment Content-Type Size
create_and_fill_tables.sql application/sql 1.0 KB
post_data.sql application/sql 530 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Antonin Houska 2023-08-17 12:25:40 Re: walsender "wakeup storm" on PG16, likely because of bc971f4025c (Optimize walsender wake up logic using condition variables)
Previous Message Nazir Bilal Yavuz 2023-08-17 11:18:58 Re: meson: Non-feature feature options