From: Pierre Ducroquet <p(dot)psql(at)pinaraf(dot)info>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Inefficiency in parallel pg_restore with many tables
Date: 2023-07-24 17:27:36
Message-ID: 1903495.6tgchFWduM@peanuts2
Lists: pgsql-hackers

On Saturday, July 15, 2023 7:47:12 PM CEST Tom Lane wrote:
> I'm not sure how big a deal this is in practice: in most situations
> the individual jobs are larger than they are in this toy example,
> plus the initial non-parallelizable part of the restore is a bigger
> bottleneck anyway with this many tables. Still, we do have one
> real-world complaint, so maybe we should look into improving it.

Hi

For what it's worth, at my current job it's kind of a big deal. I was about to
start looking into the poor pg_restore performance I got on some databases
with over 50k tables (in 200 namespaces) when I found this thread. The dump
weighs in at about 2.8GB; the toc.dat file is 230MB, covering 50,120 tables,
142,069 constraints and 73,669 indexes.

HEAD pg_restore duration: 30 minutes
pg_restore with latest patch from Nathan Bossart: 23 minutes

This is indeed better, but there is still a lot of room for improvement. For
such use cases, I was able to go much faster by using the patched pg_restore
with a script that parallelizes per schema instead of relying on the choices
made by pg_restore. It seems that parallelizing only the data loading forgoes
nice speedup opportunities when there is a huge number of objects.

patched pg_restore + parallel restore of schemas: 10 minutes
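
The idea is roughly the following (a minimal sketch, not the exact script I
used; the schema list file, paths, and job count are illustrative):

  #!/bin/sh
  # Run one pg_restore per schema, several at a time, against a
  # directory-format dump.  Assumes the target database already exists
  # and that schemas.txt holds one schema name per line, e.g. from:
  #   psql -Atc "SELECT nspname FROM pg_namespace
  #              WHERE nspname NOT LIKE 'pg\_%'
  #                AND nspname <> 'information_schema'" > schemas.txt
  DUMP=/path/to/dump.dir   # dump created with pg_dump -Fd
  DB=target_db
  JOBS=8                   # concurrent pg_restore processes

  # -n restricts each pg_restore invocation to a single schema, and
  # xargs -P keeps $JOBS of them running at once.
  xargs -P "$JOBS" -I {} \
      pg_restore -d "$DB" -n {} "$DUMP" < schemas.txt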

Anyway, the patch works fine as is, and I will certainly keep testing
future iterations.

Regards

Pierre
