Re: Bogus collation version recording in recordMultipleDependencies

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Bogus collation version recording in recordMultipleDependencies
Date: 2021-04-16 17:45:49
Message-ID: 4166274.1618595149@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> Oh, I bet it's "C.utf8", because I can reproduce the failure with that.
> This crystallizes a nagging feeling I'd had that you were misdescribing
> the collate.icu.utf8 test as not being run under --no-locale. Actually,
> it's only skipped if the encoding isn't UTF8, not the same thing.
> I think we need to remove the default-collation cases from that test too.

Hmm ... this is more subtle than it seemed.

I tried to figure out where the default-collation dependencies were coming
from, and it's quite non-obvious, at least for some of them. Observe:

u8de=# create table t1 (f1 text collate "fr_FR");
CREATE TABLE
u8de=# create index on t1(f1) where f1 > 'foo';
CREATE INDEX
u8de=# SELECT objid::regclass, refobjid::regcollation, refobjversion
FROM pg_depend d
LEFT JOIN pg_class c ON c.oid = d.objid
WHERE refclassid = 'pg_collation'::regclass
AND coalesce(relkind, 'i') = 'i'
AND relname LIKE 't1_%';
objid | refobjid | refobjversion
-----------+-----------+---------------
t1_f1_idx | "fr_FR" | 2.28
t1_f1_idx | "fr_FR" | 2.28
t1_f1_idx | "default" | 2.28
(3 rows)

(The "default" item doesn't show up if default collation is C,
which is what's causing the buildfarm instability.)

Now, it certainly looks like that index definition ought to only
have fr_FR dependencies. I dug into it and discovered that the
reason we're coming up with a dependency on "default" is that
the WHERE clause looks like

{OPEXPR
:opno 666
:opfuncid 742
:opresulttype 16
:opretset false
:opcollid 0
:inputcollid 14484
:args (
{VAR
:varno 1
:varattno 1
:vartype 25
:vartypmod -1
:varcollid 14484
:varlevelsup 0
:varnosyn 1
:varattnosyn 1
:location 23
}
{CONST
:consttype 25
:consttypmod -1
:constcollid 100
:constlen -1
:constbyval false
:constisnull false
:location 28
:constvalue 7 [ 28 0 0 0 102 111 111 ]
}
)
:location 26
}

So sure enough, the comparison operator's inputcollid is
fr_FR, but the 'foo' constant has constcollid = "default".
That will have exactly zero impact on the semantics of the
expression, but dependency.c doesn't realize that and
reports it as a dependency anyway.

I feel like this is telling us that there's a fundamental
misunderstanding in find_expr_references_walker about which
collation dependencies to report. It's reporting all the
leaf-node collations, and ignoring the ones that actually
count semantically, that is the inputcollid fields of
function and operator nodes.

Not sure what's the best thing to do here. Redesigning
this post-feature-freeze doesn't seem terribly appetizing,
but on the other hand, this index collation recording
feature has put a premium on not overstating the collation
dependencies of an expression. We don't want to tell users
that an index is broken when it isn't really.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2021-04-16 17:54:49 Re: Forget close an open relation in ReorderBufferProcessTXN()
Previous Message Tom Lane 2021-04-16 17:13:54 Re: Bogus collation version recording in recordMultipleDependencies