Re: Avoid useless retrieval of defaults and check constraints in pg_dump -a

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Avoid useless retrieval of defaults and check constraints in pg_dump -a
Date: 2020-07-14 09:14:50
Message-ID: CAOBaU_beGQEv6nJCQ-cDnHeC6hew-WaLjFyf-Snz_-rzATyGtQ@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jul 12, 2020 at 4:29 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
>
> On Sun, Jul 12, 2020 at 07:48:50AM +0200, Julien Rouhaud wrote:
> > Currently, getTableAttrs() always retrieves info about column defaults and
> > check constraints, while this will never be used if the --data-only option is used.
> > This seems like a waste of resources, so here's a patch to skip those parts
> > when the DDL won't be generated.
>
> Note that the speed of default and constraint handling has come up before:
> https://www.postgresql.org/message-id/flat/CAMkU%3D1xPqHP%3D7YPeChq6n1v_qd4WGf%2BZvtnR-b%2BgyzFqtJqMMQ%40mail.gmail.com
> https://www.postgresql.org/message-id/CAMkU=1x-e+maqefhM1yMeSiJ8J9Z+SJHgW7c9bqo3E3JMG4iJA@mail.gmail.com

Oh, I wasn't aware of that.

> I'd pointed out that a significant fraction of our pg_upgrade time was in
> pg_dump, due to having wide tables with many child tables, and "default 0" on
> every column. (I've since dropped our defaults so this is no longer an issue
> here).
>
> It appears your patch would avoid doing unnecessary work in the --data-only
> case, but it wouldn't help the pg_upgrade case.

Indeed. Making the schema part faster would probably require a bigger
refactoring. I'm wondering if we could introduce some facility to
temporarily deny any DDL change, so that the initial pg_dump -s done
by pg_upgrade can be performed before shutting down the instance.

Note that those extraneous queries were found while trying to dump
data out of a corrupted database. The issue wasn't excessive
runtime but corrupted catalog entries, so bypassing this code (since I
was only interested in the data anyway) was simpler than trying to
recover yet more corrupted rows.
