| From: | Julien Rouhaud <rjuju123(at)gmail(dot)com> |
|---|---|
| To: | Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net> |
| Cc: | pgsql-general(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Q: documentation improvement re collation version mismatch |
| Date: | 2022-11-10 11:36:21 |
| Message-ID: | 20221110113621.lkjh25rnzlacd635@jrouhaud |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-general |
On Thu, Nov 10, 2022 at 11:47:01AM +0100, Karsten Hilbert wrote:
> Thanks, Julien, for your explanation.
>
> > > regarding changed collation versions this
> > >
> > > https://www.postgresql.org/docs/devel/sql-altercollation.html
> > >
> > > says:
> > >
> > > The following query can be used to identify all
> > > collations in the current database that need to be
> > > refreshed and the objects that depend on them:
> > >
> > > SELECT pg_describe_object(refclassid, refobjid, refobjsubid) AS "Collation",
> > > pg_describe_object(classid, objid, objsubid) AS "Object"
> > > FROM pg_depend d JOIN pg_collation c
> > > ON refclassid = 'pg_collation'::regclass AND refobjid = c.oid
> > > WHERE c.collversion <> pg_collation_actual_version(c.oid)
> > > ORDER BY 1, 2;
> > >
> > > I feel the result of that query can be slightly surprising
> > > because it does not return (to my testing) any objects
> > > depending on the database default collation, nor the database
> > > itself (as per a collation version mismatch in pg_database).
> >
> > Indeed. The default collation is "pinned", so we don't record any dependency
> > on it.
>
> Indirectly we do, don't we ? Or else
>
> > > WHERE
> > > collprovider IN ('d', 'c')
>
> would not make much sense, right ?
What I meant is that we don't insert record in pg_depend to track dependencies
on pinned object, including the default collation. The collprovider here comes
from pg_index.indcollation which is a different thing. It can indeed store the
default collation, but it's only a small step toward less false negative.
Try that query with e.g.
CREATE INDEX ON sometable ( (somecol > 'somevalue') );
or
CREATE INDEX ON sometable (someid) WHERE somecol > 'somevalue';
Both clearly can get corrupted if the underlying collation library changes the
result of somecol > 'somevalue', but wouldn't be detected by that query. There
are likely a lot more cases that would be missed, you can refer to the
discussions from a couple years ago when we tried to properly track all index
collation dependencies.
> The comment above the query in the official documentation is rather assertive
> (even if may true to the letter) and may warrant some more cautionary
> wording ? Added, perhaps, some variation of this:
>
> > For now, the only safe way to go is either reindex everything, or everything
> > except some safe cases (non-partial indexes on plain-non-collatable datatypes
> > only).
I think the comment is very poorly worded, as it leads readers to believe that
objects with a pg_depend dependency on a collation are the only one that would
get corrupted in case of glibc/ICU upgrade.
I agree that there should be a big fat red warning saying something like
"reindex everything if there's any discrepancy between the recorded collation
version and the currently reported one unless you REALLY know what you're
doing."
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Ron | 2022-11-10 14:04:37 | Re: Q: documentation improvement re collation version mismatch |
| Previous Message | Karsten Hilbert | 2022-11-10 10:47:01 | Aw: Re: Q: documentation improvement re collation version mismatch |