Re: Collation versioning

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Douglas Doole <dougdoole(at)gmail(dot)com>, Christoph Berg <myon(at)debian(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Collation versioning
Date: 2020-11-02 09:28:15
Message-ID: CAOBaU_a2s4kek+HELEkAa9Y5CA+W+2dS-e7F8K6BNWuh7=xaqg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 2, 2020 at 3:56 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Mon, Nov 2, 2020 at 7:59 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> > Note that v34 now fails when run on a without that don't have
> > defined(__GLIBC__) (e.g. macos). The failure are in
> > collate.icu.utf8.sql, of the form:
> >
> > - icuidx06_d_en_fr_ga | "default" | up to date
> > + icuidx06_d_en_fr_ga | "default" | version not tracked
> >
> > Given the new approach, the only option I can see is to simply remove
> > any attempt to cover the default collation in the tests. An
> > alternative file would be a pain to maintain and it wouldn't bring any
> > value apart from checking that the default collation is either always
> > tracked or never, but not a mix of those.
>
> Blah, right. Ok, I changed it to output XXX for "default".
>
> I did a bit more language clean up. I fixed the tab completion for
> ALTER INDEX ... ALTER COLLATION. I simplified a couple of tests. I
> dropped the pg_dump TAP test for now (I might see if I can find a
> simpler way to test refobjversion restoration later). I dropped the
> special handling for T_CollateExpr in find_expr_references_walker()
> (it wasn't affecting the test cases we have, and I wasn't entirely
> sure about it; do you see a problem?). I dropped the VACUUM-log-spam
> feature for now (I'm not against it, but it seems like an independent
> enough thing to not include in the initial commit). This brought the
> patch mercifully under 100kB. This is the version I'm planning to
> commit if you don't see anything else.

Thanks a lot for the updated patch! I agree with dropping the
T_CollateExpr test in find_exp_references_walker(). This was more a
hack than anything, and it should be better addressed by an approach
that can actually handle all cases rather than some random ones.

I'm also ok with dropping the logging from VACUUM from the initial
patch, but this seems like an important codepath to handle (and an
easy one), so I hope we can address that afterwards.

I just have a minor nit:

+ /* Do they match? */
+ if (strcmp(current_version, version) != 0)
+ {
+ /*
+ * The version has changed, probably due to an OS/library upgrade or
+ * streaming replication between different OS/library versions.
+ */
+ ereport(WARNING,
+ (errmsg("index \"%s\" depends on collation \"%s\"
version \"%s\", but the current version is \"%s\"",
+ get_rel_name(context->relid),
+ get_collation_name(otherObject->objectId),
+ version,
+ current_version),
+ errdetail("The index may be corrupted due to changes
in sort order."),
+ errhint("REINDEX to avoid the risk of corruption.")));
+ }
+
+ /* Remember not to complain about this collation again. */
+ context->checked_colls = lappend_oid(context->checked_colls,
+ otherObject->objectId);

It's maybe not immediately obvious that it's safe to save the
collation oid at that point, or that it'll always be. Maybe move it
in the if branch above to make it extra clear?

and also this will probably need an update:

-#define CATALOG_VERSION_NO 202010291
+#define CATALOG_VERSION_NO 202011013

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Erik Rijkers 2020-11-02 09:36:53 Re: Split copy.c
Previous Message Jinbao Chen 2020-11-02 09:16:26 Add table AM 'tid_visible'