Re: Inefficiency in parallel pg_restore with many tables

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Inefficiency in parallel pg_restore with many tables
Date: 2023-07-22 23:19:41
Message-ID: 20230722231941.GA2020225@nathanxps13
Lists: pgsql-hackers

On Thu, Jul 20, 2023 at 12:06:44PM -0700, Nathan Bossart wrote:
> Here is a work-in-progress patch set for converting ready_list to a
> priority queue. On my machine, Tom's 100k-table example [0] takes 11.5
> minutes without these patches and 1.5 minutes with them.
>
> One item that requires more thought is binaryheap's use of Datum. AFAICT
> the Datum definitions live in postgres.h and aren't available to frontend
> code. I think we'll either need to move the Datum definitions to c.h or to
> adjust binaryheap to use "void *".

In v3, I moved the Datum definitions to c.h. I first tried modifying
binaryheap to use "int" or "void *" instead, but that ended up requiring
some rather invasive changes in backend code, not to mention any extensions
that happen to be using it. I also looked into moving the definitions to a
separate datumdefs.h header that postgres.h would include, but that felt
awkward because 1) postgres.h clearly states that it is intended for things
"that never escape the backend" and 2) the definitions seem relatively
inexpensive. However, I think the latter option is still viable, so I'm
fine with switching to it if folks think that is a better approach.
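
For anyone following along, here is a minimal sketch of the trade-off,
assuming nothing beyond standard C. It is not the binaryheap code from the
patches: SketchDatum, SketchHeap, and every sketch_* function are
hypothetical names made up for illustration. The point is that once a
pointer-sized value type is visible to frontend code, a heap can store
either small integers or pointers in the same slot without any casts to
and from "void *" in its callers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for Datum: an integer wide enough to hold a pointer. */
typedef uintptr_t SketchDatum;

/* Returns <0, 0, or >0; the heap keeps the largest element at the root. */
typedef int (*sketch_comparator) (SketchDatum a, SketchDatum b, void *arg);

typedef struct SketchHeap
{
	size_t		size;			/* number of elements currently stored */
	size_t		capacity;		/* allocated length of nodes[] */
	sketch_comparator compare;
	void	   *arg;			/* passed through to the comparator */
	SketchDatum *nodes;
} SketchHeap;

static SketchHeap *
sketch_heap_allocate(size_t capacity, sketch_comparator compare, void *arg)
{
	SketchHeap *heap = malloc(sizeof(SketchHeap));

	if (heap == NULL)
		return NULL;
	heap->nodes = malloc(capacity * sizeof(SketchDatum));
	if (heap->nodes == NULL)
	{
		free(heap);
		return NULL;
	}
	heap->size = 0;
	heap->capacity = capacity;
	heap->compare = compare;
	heap->arg = arg;
	return heap;
}

static void
sketch_heap_free(SketchHeap *heap)
{
	free(heap->nodes);
	free(heap);
}

static void
sketch_swap(SketchHeap *heap, size_t a, size_t b)
{
	SketchDatum tmp = heap->nodes[a];

	heap->nodes[a] = heap->nodes[b];
	heap->nodes[b] = tmp;
}

/* Append the new element, then sift it up until its parent is not smaller. */
static void
sketch_heap_add(SketchHeap *heap, SketchDatum d)
{
	size_t		i;

	assert(heap->size < heap->capacity);
	i = heap->size++;
	heap->nodes[i] = d;
	while (i > 0)
	{
		size_t		parent = (i - 1) / 2;

		if (heap->compare(heap->nodes[i], heap->nodes[parent], heap->arg) <= 0)
			break;
		sketch_swap(heap, i, parent);
		i = parent;
	}
}

/* Remove and return the root, then sift the moved last element down. */
static SketchDatum
sketch_heap_remove_first(SketchHeap *heap)
{
	SketchDatum result;
	size_t		i = 0;

	assert(heap->size > 0);
	result = heap->nodes[0];
	heap->nodes[0] = heap->nodes[--heap->size];
	for (;;)
	{
		size_t		left = 2 * i + 1;
		size_t		right = 2 * i + 2;
		size_t		largest = i;

		if (left < heap->size &&
			heap->compare(heap->nodes[left], heap->nodes[largest], heap->arg) > 0)
			largest = left;
		if (right < heap->size &&
			heap->compare(heap->nodes[right], heap->nodes[largest], heap->arg) > 0)
			largest = right;
		if (largest == i)
			break;
		sketch_swap(heap, i, largest);
		i = largest;
	}
	return result;
}

/* Example comparator that treats each stored value as an unsigned integer. */
static int
sketch_compare_uint(SketchDatum a, SketchDatum b, void *arg)
{
	(void) arg;
	return (a > b) - (a < b);
}
```

A caller that wants to queue pointers instead would store
(SketchDatum) ptr and cast back inside its comparator. That is roughly why
keeping a Datum-like element type looked less invasive to me than
changing the element type to "int" or "void *" everywhere.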

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment                                                  Content-Type  Size
v3-0001-move-datum-definitions-to-c.h.patch                 text/x-diff   22.1 KB
v3-0002-make-binaryheap-available-to-frontend.patch         text/x-diff   3.3 KB
v3-0003-expand-binaryheap-api.patch                         text/x-diff   2.1 KB
v3-0004-use-priority-queue-for-pg_restore-ready_list.patch  text/x-diff   14.4 KB
