From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: Further pg_upgrade analysis for many tables
Date: 2012-11-09 18:50:17
Message-ID: CAMkU=1yFfcfVDsX4qK0aOJocsMEDqE+=w=LQDzUvRmD4PNKGXQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Nov 8, 2012 at 9:50 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
>> Are you sure the server you are dumping out of is head?
>
> I experimented a bit with dumping/restoring 16000 tables matching
> Bruce's test case (ie, one serial column apiece). The pg_dump profile
> seems fairly flat, without any easy optimization targets. But
> restoring the dump script shows a rather interesting backend profile:
>
> samples  %        image name  symbol name
> 30861    39.6289  postgres    AtEOXact_RelationCache
> 9911     12.7268  postgres    hash_seq_search
...
>
> There are at least three ways we could whack that mole:
>
> * Run the psql script in --single-transaction mode, as I was mumbling
> about the other day. If we were doing AtEOXact_RelationCache only once,
> rather than once per CREATE TABLE statement, it wouldn't be a problem.
> Easy but has only a narrow scope of applicability.

That is effective when loading into 9.3 (assuming you make
max_locks_per_transaction large enough). But when loading into a
pre-9.3 server, using --single-transaction triggers the quadratic
behavior in the resource owner/lock table and makes things worse
rather than better.
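
Concretely, the single-transaction restore recipe would look something
like this (database names and the max_locks_per_transaction value are
illustrative; the setting just has to be large enough to cover every
relation touched in the one transaction):

    # in postgresql.conf on the 9.3 target, before restoring;
    # 16000 tables plus their sequences need far more than the default 64
    max_locks_per_transaction = 1024

    $ pg_dump --schema-only olddb > dump.sql
    $ psql --single-transaction --set ON_ERROR_STOP=on -f dump.sql newdb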

But there is still the question of how people can start using 9.3 if
they can't use pg_upgrade, or the pg_dump half of a dump/restore, in
order to get there.

It seems to me that pg_upgrade takes some pains to ensure that no one
else attaches to the database during its operation. In that case, is
it necessary to run the entire dump in a single transaction in order
to get a consistent picture? The attached crude patch allows pg_dump
to skip the single transaction (and thus avoid accumulating a huge
number of locks) when the --pg_upgrade flag is given.
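
With the patch applied, the dump side would be invoked roughly like
this (--pg_upgrade is the flag added by the patch, not a stock pg_dump
option; the database name is made up):

    $ pg_dump --schema-only --pg_upgrade olddb > dump.sql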

This seems to remove the quadratic behavior of running pg_dump against
pre-9.3 servers. It is linear up to 30,000 tables, each with a single
serial column, at about 1.5 msec per table.

I have no evidence other than a gut feeling that this is a safe thing to do.

I've also tested Tatsuo-san's group-"LOCK TABLE" patch against this
case, and it is of minimal help. The problem is that there is no
syntax for locking sequences, so they cannot be explicitly locked as a
group; instead they are implicitly locked one by one and so still
suffer from the quadratic behavior.
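
To illustrate (relation names are made up): the grouped form works
fine for tables,

    LOCK TABLE t1, t2, t3 IN ACCESS SHARE MODE;

but attempting the same on a sequence fails with something like

    LOCK TABLE t1_id_seq IN ACCESS SHARE MODE;
    ERROR:  "t1_id_seq" is not a table

so each sequence's lock still gets taken one at a time as the dump
first touches it.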

Cheers,

Jeff

Attachment Content-Type Size
pg_dump_for_upgrade.patch application/octet-stream 2.0 KB
