Re: Q: fixing collation version mismatches

From: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
To: Daniel Verite <daniel(at)manitou-mail(dot)org>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Q: fixing collation version mismatches
Date: 2022-11-14 17:31:43
Message-ID: Y3J7f9HY3AivTbTz@hermes.hilbert.loc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Am Mon, Nov 14, 2022 at 05:42:16PM +0100 schrieb Daniel Verite:

> > Which is why my question still stands: does the above
> > three-strikes operation safely take care of any collation
> > issues that may currently exist in a database ?
>
> For the indexes, yes, but theorically, all constraints involving collatable
> types need a recheck.
>
> For foreign key constraints with non-deterministic collations, there
> might be equality tests that pass with an older Unicode version and fail
> with a newer Unicode version.

Which gives weight to the argument that using real-world data
(instead of surrogate keys) may lead to trouble.

> For check constraints as well, checks applied to strings with recent
> Unicode characters can give different results after an upgrade.

Thanks for pointing this out more clearly. My thinking
already evolved towards also including VALIDATE CONSTRAINT.

I shall, for the record, update the sequence in question:

-- indices
REINDEX DATABASE db_in_question;
-- constraints (check, foreign key)
UPDATE pg_constraint SET convalidated = false WHERE all_check_and_FK_constraints;
ALTER TABLE table_with_constraint VALIDATE CONSTRAINT constraint_on_that_table;
-- other things, see below
-- ...
-- refresh collation versions if no errors above
ALTER DATABASE db_in_question REFRESH COLLATION VERSION;
ALTER COLLATION every_collation_from_pg_collation REFRESH VERSION;

What else needs to be taken care of, and how ?

partitions

Need to re-sort rows into the proper partition as needed.

Can this be achievd by

UPDATE each_partitioned_table SET each_partitioned_key = each_partitioned_key;

?

Courtesy of (ddl-partitioning.html#DDL-PARTITIONING-CONSTRAINT-EXCLUSION):

Updating the partition key of a row will cause it to
be moved into a different partition if it no longer
satisfies the partition bounds of its original
partition.

range types

Will this

UPDATE table_with_range_type_column SET ranged_column = ranged_column

find all relevant issues ?

domains

Will this

UPDATE table_with_domain_type_column SET domained_column = domained_column

find all relevant issues ?

custom types

??

function immutability ??

It can be argued that functions marked IMMUTABLE really
are not in case they involve sorting of a collatable data
type, and are thus wrongly marked as IMMUTABLE.

IOW pre-existing user error.

If all this has been discussed in detail, I'd be glad for a
pointer into the archive.

Thanks,
Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B

In response to

Browse pgsql-general by date

  From Date Subject
Next Message klaus.mailinglists 2022-11-14 21:04:09 PANIC: could not flush dirty data: Cannot allocate memory
Previous Message Daniel Verite 2022-11-14 16:42:16 Re: Q: fixing collation version mismatches