Re: Speeding up pg_upgrade

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Alexander Kukushkin <cyberdemn(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speeding up pg_upgrade
Date: 2017-12-07 15:37:30
Message-ID: 20171207153730.GB4628@tamriel.snowman.net
Lists: pgsql-hackers

Alexander,

* Alexander Kukushkin (cyberdemn(at)gmail(dot)com) wrote:
> Couple of months ago we at Zalando upgraded a few databases of different
> sizes to 9.6.

Thanks for sharing your experience!

> During preparations for the upgrade I've found 2.5 pain-points:
>
> 1. We are using schema-based api deployment. Basically ~every week we
> create a new schema in the database and hundreds of stored procedures in it.
> Of course we remove old API schemas and try not to keep more than the
> last 10. Before the upgrade we basically dropped all API schemas except the
> one used in production.
> And even in this case dump-restore phase was taking much more time than
> relinking of datafiles.
> Unfortunately I don't have any numbers right now, but usually a run of
> pg_upgrade was taking about 30-35 seconds, and about 2/3 of the time was
> spent in dump-restore.

Ok, so eliminating 2/3 of the time would mean bringing it down to more
like 10 seconds. That certainly seems worthwhile to me. With the
linking time being much less than the dump/restore, we could at least
consider moving forward with Bruce's original idea: do the dump/restore
while the system is online, then do the linking with the system offline,
and get a serious performance boost out of it. That also avoids the
issue with new files showing up while the system is running, which I
brought up when we were talking about having the linking done with the
system online.

> 2. The ANALYZE phase is a pain. I think everybody agrees with that.
>
> 2.5. Usually ANALYZE stage 1 completes quite fast and performance becomes
> reasonable, except in one case: some of the columns might have a
> non-default statistics target.

Ok, if the stage-1 is very fast and performance is reasonable enough
after that then perhaps it's not so bad to keep it as-is for now and
focus on the dump/restore time. That said, we should certainly also
work on improving this too.
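
For reference, `vacuumdb --analyze-in-stages` works by issuing a few SET
commands before each ANALYZE pass. A minimal Python sketch of those
per-stage commands (the targets 1 and 10 and the vacuum_cost_delay handling
match my reading of vacuumdb's behavior; the function itself is illustrative,
not vacuumdb API):

```python
# Illustrative sketch of the server settings applied before each of the
# three ANALYZE passes run by `vacuumdb --analyze-in-stages`.  Stage 1
# uses a minimal statistics target to get crude stats quickly, stage 2
# an intermediate one, and stage 3 falls back to the server default.

def analyze_stage_commands() -> list[str]:
    """Return the SET commands issued before the ANALYZE in each stage."""
    return [
        # stage 1: fastest possible, minimal statistics
        "SET default_statistics_target=1; SET vacuum_cost_delay=0;",
        # stage 2: intermediate statistics
        "SET default_statistics_target=10; RESET vacuum_cost_delay;",
        # stage 3: full statistics at the server default target
        "RESET default_statistics_target;",
    ]
```

The catch Alexander hits is that a per-column target set with ALTER TABLE
... SET STATISTICS overrides default_statistics_target, so such columns get
analyzed at their full target even in stage 1.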

> It breaks `vacuumdb --analyze-in-stages`, because for those specific
> columns it will not use the value of default_statistics_target provided by
> vacuumdb.
> What I did was reset those non-default values right before running
> pg_upgrade and restore them only when the analyze was completed. Of course,
> after that I re-analyzed those columns.
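
One way to script the workaround described above: find columns with a
per-column target (pg_attribute.attstattarget is -1 when the column uses the
default), emit ALTER statements resetting them, and keep the matching restore
statements for after the analyze. The sketch below only builds the SQL
strings; running them against the database, and the example column data in
the usage note, are assumptions:

```python
# Sketch of the reset/restore workaround: build ALTER TABLE statements to
# reset per-column statistics targets to the default before pg_upgrade,
# plus the statements to restore them afterwards.

# Catalog query to find columns carrying a non-default per-column target
# (attstattarget is -1 for "use default_statistics_target"):
FIND_OVERRIDES = """
SELECT n.nspname, c.relname, a.attname, a.attstattarget
  FROM pg_attribute a
  JOIN pg_class c ON c.oid = a.attrelid
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE a.attstattarget >= 0
   AND NOT a.attisdropped
   AND c.relkind = 'r';
"""

def reset_and_restore(rows):
    """Given (schema, table, column, target) rows, return the ALTER
    statements to reset each target to the default (-1) and the matching
    statements to restore the original values later."""
    resets, restores = [], []
    for schema, table, column, target in rows:
        base = (f'ALTER TABLE "{schema}"."{table}" '
                f'ALTER COLUMN "{column}" SET STATISTICS')
        resets.append(f"{base} -1;")          # back to the default target
        restores.append(f"{base} {target};")  # original per-column target
    return resets, restores
```

With an example row like ("public", "events", "payload", 1000), the reset
list holds the SET STATISTICS -1 statement and the restore list the SET
STATISTICS 1000 statement for that column.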

Ah, yeah, ouch, that's unfortunate... I wonder if there's something we
could do there to fix it...

Thanks!

Stephen
