Re: Further pg_upgrade analysis for many tables

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Peter Eisentraut <peter(at)eisentraut(dot)org>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: Further pg_upgrade analysis for many tables
Date: 2012-11-09 03:25:23
Message-ID: 20121109032523.GB26605@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 8, 2012 at 12:30:11PM -0500, Peter Eisentraut wrote:
> On 11/7/12 9:17 PM, Bruce Momjian wrote:
> > As a followup to Magnus's report that pg_upgrade was slow for many
> > tables, I did some more testing with many tables, e.g.:
> >
> > CREATE TABLE test991 (x SERIAL);
> >
> > I ran it for 0, 1k, 2k, ... 16k tables, and got these results:
> >
> > tables pg_dump restore pg_upgrade(increase)
> > 0 0.30 0.24 11.73(-)
> > 1000 6.46 6.55 28.79(2.45x)
> > 2000 29.82 20.96 69.75(2.42x)
> > 4000 95.70 115.88 289.82(4.16x)
> > 8000 405.38 505.93 1168.60(4.03x)
> > 16000 1702.23 2197.56 5022.82(4.30x)
>
> I can reproduce these numbers, more or less. (Additionally, it ran out
> of shared memory with the default setting when dumping the 8000 tables.)
>
> But this issue seems to be entirely the fault of sequences being
> present. When I replace the serial column with an int, everything
> finishes within seconds and scales seemingly linearly.

I did some more research and realized that I was not using --schema-only
like pg_upgrade uses. With that setting, things look like this:

--schema-only
tables pg_dump restore pg_upgrade
1 0.27 0.23 11.73(-)
1000 3.64 5.18 28.79(2.45)
2000 13.07 14.63 69.75(2.42)
4000 43.93 66.87 289.82(4.16)
8000 190.63 326.67 1168.60(4.03)
16000 757.80 1402.82 5022.82(4.30)

You can still see the 4x increase, but it now for all tests ---
basically, every time the number of tables doubles, the time to dump or
restore a _single_ table doubles, e.g. for 1k tables, a single table
takes 0.00364 to dump, for 16k tables, a single table takes 0.04736 to
dump, a 13x slowdown.

Second, with --schema-only, you can see the dump/restore is only 50% of
the duration of pg_upgrade, and you can also see that pg_upgrade itself
is slowing down as the number of tables increases, even ignoring the
dump/reload time.

This is all bad news. :-( I will keep digging.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Janes 2012-11-09 04:59:21 Re: Further pg_upgrade analysis for many tables
Previous Message Alvaro Herrera 2012-11-09 02:33:36 Re: Enabling Checksums