Re: Collation versioning

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Christoph Berg <myon(at)debian(dot)org>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Douglas Doole <dougdoole(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Collation versioning
Date: 2019-11-12 14:27:37
Message-ID: CAOBaU_bVxcvKHF64utKWZ=F927_z-_GP0DF7Uf+i9Uhi5jo6-g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Nov 10, 2019 at 10:08 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> Some more thoughts:
>
> 1. If you create an index on an expression that includes a COLLATE or
> a partial index that has one in the WHERE clause, you get bogus
> warnings:
>
> postgres=# create table t (v text);
> CREATE TABLE
> postgres=# create index on t(v) where v > 'hello' collate "en_NZ";
> WARNING: index "t_v_idx3" depends on collation "en_NZ" version "",
> but the current version is "2.28"
> DETAIL: The index may be corrupted due to changes in sort order.
> HINT: REINDEX to avoid the risk of corruption.
> CREATE INDEX
>
> postgres=# create index on t((case when v < 'x' collate "en_NZ" then
> 'foo' else 'bar' end));
> WARNING: index "t_case_idx" depends on collation "en_NZ" version "",
> but the current version is "2.28"
> DETAIL: The index may be corrupted due to changes in sort order.
> HINT: REINDEX to avoid the risk of corruption.
> CREATE INDEX
>
> That's because the 0003 patch only calls recordDependencyOnVersion()
> for simple attribute references. When
> recordDependencyOnSingleRelExpr() is called by index_create() to
> analyse ii_Expressions and ii_Predicate, it's going to have to be
> smart enough to detect collation references and record the versions.
> There is also some more code that ignores pinned collations hiding in
> there.
>
> This leads to the difficult question of how you recognise a real
> dependency on a collation's version in an expression. I have some
> vague ideas but haven't seriously looked into it yet. (The same
> question comes up for check constraint -> collation dependencies.)

Oh good point. A simple and exhaustive way to deal with that would be
to teach recordMultipleDependencies() to override isObjectPinned() and
retrieve the collation version if the referenced object is a collation
and it's neither C or POSIX collation. It means that we could also
remove the extra "version" argument and get rid of
recordDependencyOnVersion to simply call recordMultipleDependencies in
create_index for direct column references having a collation.

Would it be ok to add this kind of knowledge in
recordMultipleDependencies() or is it too hacky?

> 2. If you create a composite type with a text attribute (with or
> without an explicit collation), and then create an index on a column
> of that type, we don't record the dependency.
>
> postgres=# create type my_type as (x text collate "en_NZ");
> CREATE TYPE
> postgres=# create table t (v my_type);
> CREATE TABLE
> postgres=# create index on t(v);
> CREATE INDEX
> postgres=# select * from pg_depend where refobjversion != '';
> classid | objid | objsubid | refclassid | refobjid | refobjsubid |
> refobjversion | deptype
> ---------+-------+----------+------------+----------+-------------+---------------+---------
> (0 rows)
>
> I think create_index() will need to perform recursive analysis on
> composite types to look for text attributes, when they appear as
> simple attributes, and then add direct dependencies index -> collation
> to capture the versions. Then we might need to do the same for
> composite types hiding inside ii_Expressions and ii_Predicate (once we
> figure out what that really means).

Isn't that actually a bug? For instance such an index will have a 0
indcollation in pg_index, and according to pg_index documentation:

" this contains the OID of the collation to use for the index, or zero
if the column is not of a collatable data type."

You can't use a COLLATE expression on such data type, but it still has
a collation used.

> 3. Test t/002_pg_dump.pl in src/bin/pg_upgrade fails.

Apparently neither "make check" nor "make world" run this test :(
This was broken due collversion support in pg_dump, I have fixed it
locally.

> 4. In the warning message we should show get_collation_name() instead
> of the OID.

+1, I also fixed it locally.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2019-11-12 15:11:23 Re: Option to dump foreign data in pg_dump
Previous Message Luis Carril 2019-11-12 14:21:38 Re: Option to dump foreign data in pg_dump