Re: Further pg_upgrade analysis for many tables

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Ants Aasma <ants(at)cybertec(dot)at>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: Further pg_upgrade analysis for many tables
Date: 2012-11-28 04:13:04
Message-ID: 20121128041304.GB1820@momjian.us
Lists: pgsql-hackers

On Mon, Nov 26, 2012 at 05:26:42PM -0500, Bruce Momjian wrote:
> I have developed the attached proof-of-concept patch to test this idea.
> Unfortunately, I got poor results:
>
>                                     ---- pg_upgrade ----
>             dump   restore   dmp|res       git   dmp/res
>      1      0.12      0.07      0.13     11.16     13.03
>   1000      3.80      2.83      5.46     18.78     20.27
>   2000      5.39      5.65     13.99     26.78     28.54
>   4000     16.08     12.40     28.34     41.90     44.03
>   8000     32.77     25.70     57.97     78.61     80.09
>  16000     57.67     63.42    134.43    158.49    165.78
>  32000    131.84    176.27    302.85    380.11    389.48
>  64000    270.37    708.30   1004.39   1085.39   1094.70
>
> The last two columns show the patch didn't help at all, and the third
> column shows it is just executing the pg_dump, then the restore, not in
> parallel, i.e. column 1 + column 2 ~= column 3.
...
> I will now test using PRIMARY KEY and custom dump format with pg_restore
> --jobs to see if I can get parallelism that way.

I have some new interesting results (in seconds, test script attached):

           ---- -Fc ----        ------- dump | pg_restore/psql ------    - pg_upgrade -
           dump  restore      -Fc   -Fc|-1   -Fc|-j      -Fp   -Fp|-1      git    patch
     1     0.14     0.08     0.14     0.16     0.19     0.13     0.15    11.04    13.07
  1000     3.08     3.65     6.53     6.60     5.39     6.37     6.54    21.05    22.18
  2000     6.06     6.52    12.15    11.78    10.52    12.89    12.11    31.93    31.65
  4000    11.07    14.68    25.12    24.47    22.07    26.77    26.77    56.03    47.03
  8000    20.85    32.03    53.68    45.23    45.10    59.20    51.33   104.99    85.19
 16000    40.28    88.36   127.63    96.65   106.33   136.68   106.64   221.82   157.36
 32000    93.78   274.99   368.54   211.30   294.76   376.36   229.80   544.73   321.19
 64000   197.79  1109.22  1336.83   577.83  1117.55  1327.98   567.84  1766.12   763.02

I tested custom format with pg_restore -j and -1, as well as text
restore. The winner was pg_dump -Fc | pg_restore -1; even -j could not
beat it. (FYI, Andrew Dunstan told me that indexes can be restored in
parallel with -j.) That is actually helpful because we can use process
parallelism to restore multiple databases at the same time without
having to use processes for -j parallelism.
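
For anyone who wants to cross-check the table above, here is a minimal sketch that picks the fastest dump | restore variant per row and computes the git-to-patch ratio for pg_upgrade. The timings are copied from the last three rows of the table; the dictionary layout and the function names are illustrative only and are not part of the attached test script.

```python
# Timings in seconds, copied from the 16000/32000/64000 rows of the table above.
# Keys mirror the table's column headers; the helper names are illustrative.
PIPE_COLUMNS = ["-Fc", "-Fc|-1", "-Fc|-j", "-Fp", "-Fp|-1"]

timings = {
    16000: {"-Fc": 127.63, "-Fc|-1": 96.65, "-Fc|-j": 106.33,
            "-Fp": 136.68, "-Fp|-1": 106.64, "git": 221.82, "patch": 157.36},
    32000: {"-Fc": 368.54, "-Fc|-1": 211.30, "-Fc|-j": 294.76,
            "-Fp": 376.36, "-Fp|-1": 229.80, "git": 544.73, "patch": 321.19},
    64000: {"-Fc": 1336.83, "-Fc|-1": 577.83, "-Fc|-j": 1117.55,
            "-Fp": 1327.98, "-Fp|-1": 567.84, "git": 1766.12, "patch": 763.02},
}


def fastest_pipeline(row):
    """Return the dump | restore column with the lowest elapsed time."""
    return min(PIPE_COLUMNS, key=lambda col: row[col])


def upgrade_speedup(row):
    """Ratio of unpatched (git) to patched pg_upgrade elapsed time."""
    return row["git"] / row["patch"]


for tables, row in sorted(timings.items()):
    print(f"{tables:>6} tables: fastest pipe = {fastest_pipeline(row)}, "
          f"pg_upgrade git/patch = {upgrade_speedup(row):.2f}x")
```

In these three rows the two single-transaction (-1) variants are always the fastest pair; at 64000 tables the text-format -Fp|-1 run is marginally ahead of -Fc|-1 (567.84 vs. 577.83 seconds).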

Attached is my pg_upgrade patch for this. I am going to polish it up
for 9.3 application.

> A further parallelism would be to allow multiple databases to be
> dumped/restored at the same time. I will test for that once this is done.

I will work on this next.
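
A rough sketch of the idea, one dump | restore pipeline per database with several running at once, might look like the following. This is only an illustration in Python and not how pg_upgrade (which is written in C) does or will do it; the function names, port numbers, and worker count are hypothetical, and it assumes the target databases already exist in the new cluster.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_commands(dbname, old_port, new_port):
    """Build the pg_dump -Fc and pg_restore -1 argument lists for one database."""
    dump_cmd = ["pg_dump", "-Fc", "-p", str(old_port), dbname]
    # With no input file named, pg_restore reads the archive from stdin.
    restore_cmd = ["pg_restore", "-1", "-p", str(new_port), "-d", dbname]
    return dump_cmd, restore_cmd


def copy_database(dbname, old_port, new_port):
    """Run pg_dump -Fc | pg_restore -1 for one database; return exit codes."""
    dump_cmd, restore_cmd = build_commands(dbname, old_port, new_port)
    dump = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
    restore = subprocess.Popen(restore_cmd, stdin=dump.stdout)
    dump.stdout.close()  # so pg_dump gets SIGPIPE if pg_restore exits early
    restore_rc = restore.wait()
    dump_rc = dump.wait()
    return dbname, dump_rc, restore_rc


def copy_all(dbnames, old_port, new_port, workers=4, copy_one=copy_database):
    """Run one pipeline per database, at most `workers` at a time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda db: copy_one(db, old_port, new_port), dbnames))


# Hypothetical usage (requires two running clusters):
#   results = copy_all(["db1", "db2", "db3"], old_port=5432, new_port=5433)
```

Threads are enough here because each worker just waits on its two child processes; the parallelism comes from the separate pg_dump and pg_restore processes, one pair per database.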

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachment Content-Type Size
test_many_tables text/plain 3.1 KB
pg_upgrade.diff text/x-diff 9.5 KB
