Re: Collation versioning

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Douglas Doole <dougdoole(at)gmail(dot)com>, Christoph Berg <myon(at)debian(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Collation versioning
Date: 2020-09-08 14:45:07
Message-ID: 20200908144507.GA57691@nol
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 07, 2020 at 12:17:41PM +0900, Michael Paquier wrote:
> On Fri, Aug 14, 2020 at 11:02:35AM +0200, Julien Rouhaud wrote:
> > On Fri, Aug 14, 2020 at 04:37:46PM +0900, Michael Paquier wrote:
> >> + /*
> >> + * XXX For deterministic transaction, se should only track the
> >> version
> >> + * if the AM relies on a stable ordering.
> >> + */
> >> + if (determ_colls)
> >> + {
> >> + /* XXX check if the AM relies on a stable ordering */
> >> + recordDependencyOnCollations(&myself, determ_colls, true);
> >> Some cleanup needed here? Wouldn't it be better to address the issues
> >> with stable ordering first?
> >
> > Didn't we just agreed 3 mails ago to *not* take care of that in this patch, and
> > add an extensible solution for that later? I kept the XXX comment to make it
> > extra clear that this will be addressed.
>
> FWIW, I tend to prefer the approach where we put in place the
> necessary infrastructure first, and then have a feature rely on what
> we think is the most correct. This way, we avoid having any moment in
> the code history where we have something that we know from the start
> is not covered.

I usually agree with that approach, I'm just afraid that getting a consensus on
the best way to do that will induce a lot of discussions, while this is
probably a corner case due to general usage of hash and bloom indexes.

Anyway, in order to make progress on that topic I attach an additional POC
commit to add the required infrastructure to handle this case in
v29-0001-Add-a-new-amnostablecollorder-flag-in-IndexAmRou.patch.

Here's the rationale for this new flag, extracted from the patch's commit
message:

Add a new amnostablecollorder flag in IndexAmRoutine.

This flag indicates if the access method does not rely on a stable collation
ordering for deterministic collation, i.e. would not be corrupted if the
underlying collation library changes its ordering. This is done this way to
make sure that if any external access method isn't updated to correctly setup
this flag it will be assumed to rely on a stable collation ordering as this
should be the case for the majority of access methods.

This flag will be useful for an upcoming commit that will add detection of
possibly corrupted index due to changed collation library version.

>
> The patch set needs a rebase. There are conflicts coming at least
> from pg_depend.c where I switched the code to use multi-INSERTs for
> catalog insertions.

Fixed.

Attachment Content-Type Size
v29-0001-Add-a-new-amnostablecollorder-flag-in-IndexAmRou.patch text/plain 6.9 KB
v29-0002-Remove-pg_collation.collversion.patch text/plain 27.7 KB
v29-0003-Add-pg_depend.refobjversion.patch text/plain 12.3 KB
v29-0004-Track-collation-versions-for-indexes.patch text/plain 104.6 KB
v29-0005-Add-ALTER-INDEX-.-ALTER-COLLATION-.-REFRESH-VERS.patch text/plain 14.8 KB
v29-0006-Doc-Add-Collation-Versions-section.patch text/plain 2.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2020-09-08 14:49:19 Re: Logical Replication - detail message with names of missing columns
Previous Message Bharath Rupireddy 2020-09-08 14:23:24 Re: Logical Replication - detail message with names of missing columns