Better Upgrades

From: David Fetter <david(at)fetter(dot)org>
To: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Better Upgrades
Date: 2018-02-06 00:09:18
Message-ID: 20180206000917.GP18043@fetter.org
Lists: pgsql-hackers

Folks,

While chatting with Bruce about how to make something better than
pg_upgrade, we (and by "we," I mean mostly Bruce) came up with the
following.

What needs improvement:

- pg_upgrade forces a downtime event, no matter how cleverly it's done.
- pg_upgrade is very much a blocker for on-disk format changes.

The proposal:

- Add a new script, possibly in Perl or Bash, which would:
    - initdb a new cluster with the new version of PostgreSQL and a
      different port.
    - Start logical replication from the old version to the new
      version.
    - Poll until replication lag drops to a pre-determined default
      amount, then:
        * Issue an ALTER SYSTEM on the new server to change its port
          to the old server's
        * Issue a pg_ctl stop -w to the old server
        * Issue a pg_ctl restart on the new server
        * Happiness!
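
For concreteness, the steps above could be sketched roughly as follows.
This is only an illustration in Python rather than the Perl or Bash the
proposal has in mind. It builds the command lines without running them.
The names `upgrade_pub` and `upgrade_sub`, the helper functions, and the
use of a byte count as the lag measure are all assumptions of the sketch,
not part of the proposal. It also assumes the schema already exists on the
new cluster, that the old server runs with wal_level = logical, and that
`initdb` and the `pg_ctl` used for the new data directory come from the
new version's binaries.

```python
import time
from typing import Callable, List

# Hypothetical names chosen for this sketch; the email does not specify any.
PUBLICATION = "upgrade_pub"
SUBSCRIPTION = "upgrade_sub"

# Bytes of WAL the subscriber has not yet confirmed, measured on the old
# server. Assumes the replication slot carries the subscription's name,
# which is the default for CREATE SUBSCRIPTION.
LAG_QUERY = (
    "SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) "
    f"FROM pg_replication_slots WHERE slot_name = '{SUBSCRIPTION}'"
)


def psql(port: int, dbname: str, sql: str) -> List[str]:
    """Build a psql command line that runs one SQL statement."""
    return ["psql", "-p", str(port), "-d", dbname, "-c", sql]


def setup_commands(new_dir: str, old_port: int, new_port: int,
                   dbname: str) -> List[List[str]]:
    """Commands to create the new cluster and start logical replication."""
    conninfo = f"port={old_port} dbname={dbname}"
    return [
        ["initdb", "-D", new_dir],
        ["pg_ctl", "start", "-w", "-D", new_dir, "-o", f"-p {new_port}"],
        psql(old_port, dbname,
             f"CREATE PUBLICATION {PUBLICATION} FOR ALL TABLES"),
        psql(new_port, dbname,
             f"CREATE SUBSCRIPTION {SUBSCRIPTION} "
             f"CONNECTION '{conninfo}' PUBLICATION {PUBLICATION}"),
    ]


def cutover_commands(old_dir: str, new_dir: str, old_port: int,
                     new_port: int, dbname: str) -> List[List[str]]:
    """Commands for the final switch, in the order the proposal lists them."""
    return [
        psql(new_port, dbname, f"ALTER SYSTEM SET port = {old_port}"),
        ["pg_ctl", "stop", "-w", "-D", old_dir],
        ["pg_ctl", "restart", "-w", "-D", new_dir],
    ]


def wait_for_lag(get_lag_bytes: Callable[[], int], max_lag_bytes: int,
                 interval_s: float = 1.0, timeout_s: float = 3600.0,
                 sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until lag is at or below the threshold; False on timeout."""
    waited = 0.0
    while waited <= timeout_s:
        if get_lag_bytes() <= max_lag_bytes:
            return True
        sleep(interval_s)
        waited += interval_s
    return False
```

A driver script would run each command with something like
`subprocess.run(cmd, check=True)` and supply a `get_lag_bytes` callable
that executes `LAG_QUERY` against the old server. Note that logical
replication is per-database, so a real script would have to repeat the
publication and subscription steps for every database in the cluster.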

Assumptions underlying it:

- Disk and similar resources are cheap enough for most users that
doubling up during the upgrade is feasible.
- The default upgrade path should require exactly one step.
- Errors do not, by and large, have the capacity to violate an SLA.

The proposal has blockers:

- We don't actually have logical decoding for DDL, although I'm given
to understand that Álvaro Herrera has done some yeoman follow-up
work on Dimitri Fontaine's PoC patches.
- We don't have logical decoding for DCL (GRANT/REVOKE) either.

We also came up with and, we believe, addressed an important issue,
namely how to ensure continuity. When we issue a `pg_ctl stop -w`,
that's short for "Cancel current commands and stop cleanly." At this
point, the new server will not have WAL to replay, so a pg_ctl restart
will load the new configuration and come up pretty much immediately,
and a client's next connection attempt will find a brand new server
without a downtime event.
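
The continuity argument relies on clients simply trying again after
their connection is cut. A minimal sketch of that client-side behaviour,
in Python, might look like the following. The `retry_connect` helper and
its parameters are hypothetical, and `connect` stands in for whatever a
real driver provides. Real drivers raise their own exception types, so
the `ConnectionError` caught here is an assumption of the sketch.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_connect(connect: Callable[[], T], attempts: int = 5,
                  delay_s: float = 0.5,
                  sleep: Callable[[float], None] = time.sleep) -> T:
    """Call connect() until it succeeds or the attempts run out."""
    last_error: Optional[BaseException] = None
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError as exc:
            # The old server has stopped and the new one may not be
            # listening on the old port yet; wait briefly and try again.
            last_error = exc
            if attempt < attempts - 1:
                sleep(delay_s)
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```

With a short delay, a client wrapped this way would see the cutover as
one or two failed attempts followed by a session on the new server,
provided the restart really does complete as quickly as described.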

Does this seem worth coding up in its current form?

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
