Re: Preserve versions of initdb-created collations in pg_upgrade

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Preserve versions of initdb-created collations in pg_upgrade
Date: 2019-10-29 02:33:12
Message-ID: CA+hUKGL4topbAZQrRui_wa-vUcLn_Ug6H2UwF1JWUQUqfWLwEg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Oct 29, 2019 at 1:52 AM Peter Eisentraut
<peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
> As mentioned in [0], pg_upgrade currently does not preserve the version
> of collation objects created by initdb. Here is an attempt to fix that.
>
> The way I deal with this here is by having the binary-upgrade mode in
> pg_dump delete all the collations created by initdb and then dump out
> CREATE COLLATION commands with version information normally.

This seems to be basically OK.

It does mean that the target database has collation OIDs >=
FirstNormalObjectId. That is, they don't look like initdb-created
objects, which is OK because they aren't, I'm just highlighting this
to see if anyone else sees a problem with it. Suppose you pg_upgrade
again: now you'll dump these collations just as you did the first time
around, because they look exactly like user-defined collations. It
also means that if you pg_upgrade to a target cluster created by a
build without ICU we'll try to create ICU collations and that'll fail
("ICU is not supported in this build"), whereas before if had ICU
collations and didn't ever make use of them, you'd be able to do such
an upgrade; again this doesn't seem like a major problem, it's just an
observation about an edge case. One more thing to note is if you
upgrade from 12 to 13 on a glibc system, I think we'll automatically
pick up the *current* version when creating the collations in the target
DB, which seems to be OK but it is a choice to default to assuming
that the database's indexes are not corrupted. Another observation is
that you finish up with different OIDs in each database you upgrade,
which again doesn't seem like a problem in itself. It is slightly odd that
template1 finishes up with the old initdb's template1 collatoins, rather
than the new initdb's opinion of the current set of collations, but I am
not sure if it's a problem. I think it has to be like that, because you
might have created other stuff that depends on those collations in your
source template1 database, and so you have to preserve the versions.

> I had originally imagined doing some kind of ALTER COLLATION (or perhaps
> a direct UPDATE pg_collation) to update the version information, but
> that doesn't really work because we don't know whether the collation
> object with a given name in the new cluster is the same as the one in
> the old cluster. So it seems more robust to just delete all existing
> collations and create them from scratch.
>
> Thoughts?

Seems to work as described with -E UTF-8, but it fails with clusters
using -E SQL_ASCII. That causes the pg_upgrade check to fail on
machines where that is the default encoding chosen by initdb (where
unpatched master succeeds):

pg_restore: creating COLLATION "pg_catalog.af-NA-x-icu"
pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 1700; 3456 12683 COLLATION af-NA-x-icu tmunro
pg_restore: error: could not execute query: ERROR: collation
"pg_catalog.af-NA-x-icu" for encoding "SQL_ASCII" does not exist
Command was: ALTER COLLATION pg_catalog."af-NA-x-icu" OWNER TO tmunro;

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-10-29 03:00:28 Getting psql to redisplay command after \e
Previous Message Michael Paquier 2019-10-29 02:20:08 Re: v12.0: interrupt reindex CONCURRENTLY: ccold: ERROR: could not find tuple for parent of relation ...