
Re: Further pg_upgrade analysis for many tables

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Ants Aasma <ants(at)cybertec(dot)at>
Cc:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject:	Re: Further pg_upgrade analysis for many tables
Date:	2012-11-12 20:59:27
Message-ID:	20121112205927.GD14488@momjian.us
Lists:	pgsql-hackers

On Mon, Nov 12, 2012 at 12:09:08PM -0500, Bruce Momjian wrote:
> OK, I have had some time to think about this. What the current code
> does is, for each database, get a directory listing to know about any
> vm, fsm, and >1gig extents that exist in the directory. It caches the
> directory listing and does full array scans looking for matches. If the
> tablespace changes, it creates a new directory cache and throws away the
> old one. This code certainly needs improvement!
>
> I can think of two solutions. The first would be to scan the database
> directory, and any tablespaces used by the database, sort it, then allow
> binary search of the directory listing looking for file prefixes that
> match the current relation.
>
> The second approach would be to simply try to copy the fsm, vm, and
> extent files, and ignore any ENOENT errors. This allows code
> simplification. There are two downsides. First, it doesn't pull all
> files with matching prefixes --- it requires pg_upgrade to _know_ what
> suffixes might exist in that directory. Second, it assumes there can be
> no gaps in the file extent numbering (is that safe?).
>
> I need recommendations on which direction to pursue; this would only be
> for 9.3.
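
For illustration only, the first approach quoted above (sort the directory listing once, then binary-search it for each relation's files) could look roughly like the Python sketch below. This is not the pg_upgrade code; the function names are hypothetical, and it assumes the usual on-disk naming where a relation's files are its relfilenode, optionally followed by "_fsm" or "_vm" and a ".N" segment number.

```python
import bisect


def prefix_range(sorted_names, prefix):
    """Return every name in sorted_names that starts with prefix (binary search)."""
    lo = bisect.bisect_left(sorted_names, prefix)
    # Smallest string greater than every string that starts with prefix.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    hi = bisect.bisect_left(sorted_names, upper)
    return sorted_names[lo:hi]


def files_for_relation(sorted_names, relfilenode):
    """Find all files of one relation in a listing sorted by code point."""
    base = str(relfilenode)
    found = []
    i = bisect.bisect_left(sorted_names, base)
    if i < len(sorted_names) and sorted_names[i] == base:
        found.append(base)  # first segment of the main fork
    # Searching "16384." and "16384_" separately skips names like "163840".
    found += prefix_range(sorted_names, base + ".")
    found += prefix_range(sorted_names, base + "_")
    return found


listing = sorted(["16385", "163840", "16384_vm", "16384", "16384.1", "16384_fsm"])
print(files_for_relation(listing, 16384))
```

The listing has to be sorted with the same ordering the search uses (plain code-point order here); a locale-aware sort would break the range lookups.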

I went with the second idea, patch attached. Here are the times:

          ------------ 9.2 -----------    ------------ 9.3 -----------
          -- normal --    -- bin-up --    -- normal --    -- bin-up --      pg_upgrade
          dump    rest    dump    rest    dump    rest    dump    rest     git   patch
     1    0.12    0.06    0.12    0.06    0.11    0.07    0.11    0.07   11.11   11.02
  1000    7.22    2.40    4.74    2.78    2.20    2.43    4.04    2.86   19.60   19.25
  2000    5.67    5.10    8.82    5.57    4.50    4.97    8.07    5.69   30.55   26.67
  4000   13.34   11.13   25.16   12.52    8.95   11.24   16.75   12.16   60.70   52.31
  8000   29.12   25.98   59.60   28.08   16.68   24.02   30.63   27.08  123.05  102.78
 16000   87.36   53.16  189.38   62.72   31.38   55.37   61.55   62.66  365.71  286.00
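
One way to read the two pg_upgrade columns is as marginal cost: how much time each additional table adds between consecutive rows. The Python sketch below is only a reading aid; the ROWS list copies the "git" and "patch" figures from the table above, and the function name is made up for this example.

```python
# (row label, pg_upgrade git, pg_upgrade patch), copied from the table above.
ROWS = [
    (1, 11.11, 11.02),
    (1000, 19.60, 19.25),
    (2000, 30.55, 26.67),
    (4000, 60.70, 52.31),
    (8000, 123.05, 102.78),
    (16000, 365.71, 286.00),
]


def marginal_cost(rows, column):
    """Time added per extra table between consecutive rows.

    column is 1 for the "git" figures and 2 for the "patch" figures.
    """
    return [
        (b[0], (b[column] - a[column]) / (b[0] - a[0]))
        for a, b in zip(rows, rows[1:])
    ]


for (n, git), (_, patch) in zip(marginal_cost(ROWS, 1), marginal_cost(ROWS, 2)):
    print(f"up to {n:>5}: git {git * 1000:6.1f}  patch {patch * 1000:6.1f}  (per 1000 tables)")
```

With perfectly linear scaling these per-table figures would stay flat from row to row. In the patched column the figure goes from about 12.6 per thousand tables (4000 to 8000) to about 22.9 (8000 to 16000).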

You can see a significant speedup with the directory-listing scan loops
removed. The 16k case is improved, but still not linear. The 16k
dump/restore scaling looks fine, so it must be something in pg_upgrade,
or in the kernel.
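
As a rough illustration of the second approach (try each file name that might exist and treat "no such file" as the signal to stop), here is a minimal Python sketch. It is not the attached patch: the function name and the suffix list are assumptions made for this example, and it assumes the destination directory already exists.

```python
import os
import shutil

# Assumed suffixes: "" is the main fork, then free space map and visibility map.
FORK_SUFFIXES = ("", "_fsm", "_vm")


def copy_relation_files(src_dir, dst_dir, relfilenode):
    """Copy every file of one relation, probing names instead of listing the directory."""
    copied = []
    for suffix in FORK_SUFFIXES:
        segno = 0
        while True:
            # Segment 0 has no numeric extension; later segments are ".1", ".2", ...
            name = f"{relfilenode}{suffix}" + (f".{segno}" if segno else "")
            try:
                shutil.copyfile(os.path.join(src_dir, name),
                                os.path.join(dst_dir, name))
            except FileNotFoundError:
                # ENOENT: this fork is absent, or there are no more segments.
                break
            copied.append(name)
            segno += 1
    return copied
```

Because the loop stops at the first missing segment, a file such as "16384.3" is never copied when "16384.2" is absent. That is the "no gaps in the extent numbering" assumption from the quoted message. Real code would also want to treat a missing first segment of the main fork as an error rather than silently skipping it.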

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachment	Content-Type	Size
pg_upgrade.diff	text/x-diff	8.8 KB

In response to

Re: Further pg_upgrade analysis for many tables at 2012-11-12 17:09:08 from Bruce Momjian

Responses

Re: Further pg_upgrade analysis for many tables at 2012-11-12 21:11:22 from Bruce Momjian
Re: Further pg_upgrade analysis for many tables at 2012-11-12 21:14:59 from Alvaro Herrera
Re: Further pg_upgrade analysis for many tables at 2012-11-13 03:44:54 from Ants Aasma
