Re: Collation versioning

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Christoph Berg <myon(at)debian(dot)org>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Douglas Doole <dougdoole(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Collation versioning
Date: 2019-11-24 19:17:01
Message-ID: CAOBaU_bF-mhb5--HOzgFebcke3J0vYGOAaeb24bipeYQrOyw0g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 12, 2019 at 10:16 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Wed, Nov 13, 2019 at 3:27 AM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> > On Sun, Nov 10, 2019 at 10:08 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > > That's because the 0003 patch only calls recordDependencyOnVersion()
> > > for simple attribute references. When
> > > recordDependencyOnSingleRelExpr() is called by index_create() to
> > > analyse ii_Expressions and ii_Predicate, it's going to have to be
> > > smart enough to detect collation references and record the versions.
> > > There is also some more code that ignores pinned collations hiding in
> > > there.
[...]

Indeed. Now, using a composite type in an expression index, I can see that eg.

CREATE TYPE mytype AS (fr text COLLATE "fr-x-icu", en text COLLATE "en-x-icu");
CREATE TABLE test1(id integer, myval mytype);
CREATE INDEX ON sometable (somecol) WHERE (mytype).fr_name = 'meh'

does create a dependency on fr-x-icu collation, because collations are
checked for FieldSelect nodes (which indeed ignores default
collation), but eg.

CREATE INDEX idx2 ON test1(id) WHERE myval = ('foo', 'bar');

won't, so I confirm that recordDependencyOnSingleRelExpr() isn't
bullet proof either for finding collation dependencies.

> > > I think create_index() will need to perform recursive analysis on
> > > composite types to look for text attributes, when they appear as
> > > simple attributes, and then add direct dependencies index -> collation
> > > to capture the versions. Then we might need to do the same for
> > > composite types hiding inside ii_Expressions and ii_Predicate (once we
> > > figure out what that really means).

I did write some code that can extract all collations that are used by
a datatype, which seems to work as intended over many combinations of
composite / array / domain types used in index simple attributes. I'm
not sure if I should change find_expr_references_walker (called by
recordDependencyOnExpr) to also track those new dependencies in
ii_Expression and ii_Predicate, as it'll also add unneeded
dependencies for other callers. And if I should add version detection
there too or have recordMultipleDependencies() automatically take care
of this.

> > A simple and exhaustive way to deal with that would be
> > to teach recordMultipleDependencies() to override isObjectPinned() and
> > retrieve the collation version if the referenced object is a collation
> > and it's neither C or POSIX collation
> >
> That doesn't seem like the right place; that's a raw data insertion
> function, though... I guess it does already have enough brains to skip
> pinned objects. Hmm.

Another point is that unless we also do an additional check to see
what relkind is referencing the collation, it'll record a version
string for types and other objects.

> > Isn't that actually a bug? For instance such an index will have a 0
> > indcollation in pg_index, and according to pg_index documentation:
> >
> > " this contains the OID of the collation to use for the index, or zero
> > if the column is not of a collatable data type."
> >
> > You can't use a COLLATE expression on such data type, but it still has
> > a collation used.
>
> I don't think it's a bug. The information is available, but you have
> to follow the graph to get it. Considering that the composite type
> could be something like CREATE TYPE my_type AS (fr_name text COLLATE
> "fr_CA", en_name text COLLATE "en_CA"), there is no single collation
> you could put into pg_index.indcollation anyway. As for pg_depend,
> it's currently enough for the index to depend on the type, and the
> type to depend on the collation, because the purpose of dependencies
> is to control dropping and dumping order, but for our new purpose we
> also need to create direct dependencies index -> "fr_CA", index ->
> "en_CA" so we can record the current versions.

Oh right, I didn't think about that.

> > > 3. Test t/002_pg_dump.pl in src/bin/pg_upgrade fails.
> >
> > Apparently neither "make check" nor "make world" run this test :(
> > This was broken due collversion support in pg_dump, I have fixed it
> > locally.
>
> make check-world

Thanks!

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ranier Vilela 2019-11-24 19:21:06 RE: [PATCH] Fix possible underflow in expression (maxoff - 1)
Previous Message Peter Geoghegan 2019-11-24 19:07:37 Re: [PATCH] Fix possible underflow in expression (maxoff - 1)