Re: proposal: change behavior on collation version mismatch

From: Jeremy Schneider <schnjere(at)amazon(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: proposal: change behavior on collation version mismatch
Date: 2023-11-27 23:35:19
Message-ID: c6fa7b2e-5c69-473a-9a7d-7eaa729e4cc7@amazon.com
Lists: pgsql-hackers

On 11/27/23 12:29 PM, Jeff Davis wrote:
>> 2) "most users would rather have ease-of-use than 100% safety, since
>> it's uncommon"
>>
>> And I think this led to the current behavior of issuing a warning
>> rather
>> than an error
> The elevel trade-off is *availability* vs safety, not ease-of-use vs
> safety. It's harder to reason about what most users might want in that
> situation.

I don't agree that this is hard to reason about; I've always thought
durability & correctness are generally supposed to be prioritized over
availability in databases. For many enterprise customers, if they asked
why their database wouldn't accept connections after an OS upgrade and
we explained that durability & correctness are prioritized over
availability, I think they would agree we're doing the right thing.

In practice this always happens after a major operating system update of
some kind (it would be an unintentional bug in a minor OS upgrade).  In
most cases, I hope the error will happen immediately because users
ideally won't even be able to connect (for a DB-level glibc collation
and for an ICU default setting).  A hard error right after an OS upgrade
is actually pretty easy for most people to deal with: most users will
immediately understand that something went wrong related to the OS
upgrade.  And basic testing would turn up connection errors before
the production upgrade as long as a connection was attempted as part of
the test.

It seems to me that much of the hand-wringing is around taking a hard
line on not allowing in-place OS upgrades. We're all aware that when
you're talking about tens of terabytes, in-place upgrade is just a lot
more convenient and easy than the alternatives. And we're aware that
some other relational databases support this (and also bundle collation
libs directly in the DB rather than using external libraries).

I myself wouldn't frame this as an availability issue; I think it's more
about ease-of-use in the sense of allowing low-downtime major OS
upgrades without the complexity of logical replication (but perhaps with
a risk of data loss, because with Unicode nobody can actually be 100%
sure there are no risky characters stored in the DB, and even those of
us with extensive expert knowledge struggle to accurately characterize
the risk level).

The hand-wringing often comes down to the argument "but MAYBE en_US
didn't change in those 3 major version releases of ICU that you jumped
across to land a new Ubuntu LTS release". It's one thing to make this
argument with ISO 8859, but in the Unicode world en_US has default sort
rules for Japanese, Chinese, Arabic, Cyrillic, Nepalese, and all kinds
of strings with nonsensical combinations of all these characters.  After
some years of ICU and PG, I'm coming to the conclusion that the right
thing to do is to stay safe and not change ICU versions (or glibc
versions) for existing databases in-place.

-Jeremy

--
Jeremy Schneider
Performance Engineer
Amazon Web Services
