Re: Pg 16: will pg_dump & pg_restore be faster?

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Ron <ronljohnsonjr(at)gmail(dot)com>, pgsql-general(at)lists(dot)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Pg 16: will pg_dump & pg_restore be faster?
Date: 2023-05-31 01:13:08
Message-ID: ZHafJF3FadgFr/5A@momjian.us
Lists: pgsql-general

On Wed, May 31, 2023 at 09:14:20AM +1200, David Rowley wrote:
> On Wed, 31 May 2023 at 08:54, Ron <ronljohnsonjr(at)gmail(dot)com> wrote:
> > https://www.postgresql.org/about/news/postgresql-16-beta-1-released-2643/
> > says "PostgreSQL 16 can also improve the performance of concurrent bulk
> > loading of data using COPY up to 300%."
> >
> > Since pg_dump & pg_restore use COPY (or something very similar), will the
> > speed increase translate to higher speeds for those utilities?
>
> I think the improvements to relation extension only help when multiple
> backends need to extend the relation at the same time. pg_restore can
> have multiple workers, but the tasks that each worker performs are
> only divided as far as an entire table, i.e. 2 workers will never be
> working on the same table at the same time. So there is no concurrency
> in terms of 2 or more workers working on loading data into the same
> table at the same time.
>
> It might be an interesting project now that we have TidRange scans, to
> have pg_dump split larger tables into chunks so that they can be
> restored in parallel.
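
To make the chunking idea in the quoted text concrete, here is a minimal sketch of how a table might be split into block ranges that a TID Range Scan (available since PostgreSQL 14) could serve. This is not something pg_dump does today. The helper names `tid_chunks` and `chunk_copy_sql`, the table name `big_table`, and the chunk size are all hypothetical, and the block count is assumed to have been obtained separately (for example, relation size divided by block size).

```python
from typing import Iterator, Tuple


def tid_chunks(total_blocks: int, blocks_per_chunk: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open block ranges [start, end) that together cover the table."""
    if blocks_per_chunk <= 0:
        raise ValueError("blocks_per_chunk must be positive")
    for start in range(0, total_blocks, blocks_per_chunk):
        yield start, min(start + blocks_per_chunk, total_blocks)


def chunk_copy_sql(table: str, start: int, end: int) -> str:
    """Build a COPY statement limited to one ctid range.

    Assumption: `table` is a trusted, already-quoted identifier.
    The ctid bounds are the kind of quals a TID Range Scan can use.
    """
    return (
        f"COPY (SELECT * FROM {table} "
        f"WHERE ctid >= '({start},0)'::tid AND ctid < '({end},0)'::tid) TO STDOUT"
    )


if __name__ == "__main__":
    # Hypothetical 10-block table dumped in chunks of 4 blocks.
    for lo, hi in tid_chunks(10, 4):
        print(chunk_copy_sql("big_table", lo, hi))
```

Each statement could in principle be dumped and restored by a separate worker. Rows added or moved after the block count is taken would need a consistent snapshot to be handled correctly, which this sketch does not address.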

Uh, the release notes say:

<!--
Author: Andres Freund <andres(at)anarazel(dot)de>
2023-04-06 [00d1e02be] hio: Use ExtendBufferedRelBy() to extend tables more eff
Author: Andres Freund <andres(at)anarazel(dot)de>
2023-04-06 [26158b852] Use ExtendBufferedRelTo() in XLogReadBufferExtended()
-->

<listitem>
<para>
Allow more efficient addition of heap and index pages (Andres Freund)
</para>
</listitem>

There is no mention of concurrency being a requirement. Is it wrong? I
think there was a question of whether you had to add _multiple_ blocks
to get a benefit, not whether concurrency was needed. This email about
the release notes didn't mention a concurrency requirement:

https://www.postgresql.org/message-id/20230521171341.jjxykfsefsek4kzj%40awork3.anarazel.de

    While the case of extending by multiple pages improved the most, even
    extending by a single page at a time got a good bit more scalable. Maybe
    just "Improve efficiency of extending relations"?

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

Only you can decide what is important to you.
