Re: Collations and Replication; Next Steps

From: Greg Stark <stark(at)mit(dot)edu>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Matthew Kelly <mkelly(at)tripadvisor(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Matthew Spilich <mspilich(at)tripadvisor(dot)com>
Subject: Re: Collations and Replication; Next Steps
Date: 2014-09-17 14:46:42
Message-ID: CAM-w4HPaBXFE6NF4MvqkhrPs_FM8G5Zd+VeJTT1VqL3GYxzcwg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 16, 2014 at 11:41 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> The timezone case you highlight here seems quite distinct from what
> Matthew is talking about, because in point of fact the on-disk
> representation is merely *interpreted* with reference to the timezone
> database. So, you could have an inconsistency between standbys
> concerning what the time was in a particular timezone at a particular
> timestamp value as reported by the timestamptz output function, but
> both standbys would be correct on their own terms, which isn't too
> bad.

You could have a problem if you have an expression index on (timestamp
AT TIME ZONE '...'). I may have the expression slightly wrong, but I
believe it is possible to write an immutable expression that depends
on the tzdata data as long as it doesn't depend on the user's
current time zone (which would make it stable but not immutable). The
actual likelihood of that situation might be much lower, and the
ability to avoid it higher, but in theory I think Peter's right that
it's the same class of problem.
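
To make the hazard concrete, here is a rough sketch in Python rather than
SQL. The two fixed offsets stand in for two hypothetical tzdata releases
that disagree about one zone; the offsets, names, and the dict-as-index
are all illustrative assumptions, not how Postgres stores anything.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical: the same named zone under two tzdata releases that disagree
# on its UTC offset. The offsets are made up for illustration.
ZONE_ON_PRIMARY = timezone(timedelta(hours=3))
ZONE_ON_STANDBY = timezone(timedelta(hours=4))


def local_key(ts_utc, zone):
    """Mimic an index expression of the AT TIME ZONE kind: naive local time."""
    return ts_utc.astimezone(zone).replace(tzinfo=None)


rows = [datetime(2014, 9, 17, h, 0, tzinfo=timezone.utc) for h in (10, 14, 18)]

# The "index" is built on the primary: stored key -> row.
index = {local_key(ts, ZONE_ON_PRIMARY): ts for ts in rows}

# The standby recomputes the key for the same row under its own rules and
# probes the replicated index with it.
probe = local_key(rows[0], ZONE_ON_STANDBY)
found_on_standby = probe in index
found_on_primary = local_key(rows[0], ZONE_ON_PRIMARY) in index
```

The row is present in the index on both machines, but only the primary
computes a key that matches what was stored.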

Generally speaking we try to protect against most environment
dependencies that lead to corrupt databases by encoding them in the
control file. Obviously we can't encode an entire collation in the
control file, though. We could conceivably have a corpus of
representative strings that we sort and then checksum in the
control file. It wouldn't be foolproof, but if we collect interesting
examples as we find them it might be a worthwhile safety check.
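
A minimal sketch of that corpus-and-checksum idea, in Python for brevity.
The corpus contents, the function name, and the choice of SHA-256 are
assumptions made for illustration; nothing like this exists in the
control file today.

```python
import hashlib
import locale

# Hypothetical sample corpus; a real one would collect strings known to sort
# differently across collation library versions.
CORPUS = ["a", "B", "a-b", "ab", "A b", "é", "e", "Z", "z", "1-1", "11"]


def collation_fingerprint(strings, key=locale.strxfrm):
    """Sort strings with the given collation key and hash the resulting order."""
    ordered = sorted(strings, key=key)
    digest = hashlib.sha256()
    for s in ordered:
        data = s.encode("utf-8")
        # Length-prefix each entry so two different orderings cannot produce
        # the same byte stream by concatenation.
        digest.update(len(data).to_bytes(4, "big"))
        digest.update(data)
    return digest.hexdigest()
```

The primary would store the fingerprint once, and a standby would
recompute it at startup and complain on a mismatch. A matching
fingerprint only says the two collations agree on the corpus, which is
why it is a safety check and not a guarantee.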

Just brainstorming... I wonder if it would be possible to include any
collation comparisons made in handling an index insert in the xlog
record and have the standby verify those comparisons are valid on the
standby. I guess that would be pretty hard to arrange code-wise, since
the comparisons could be coming from anywhere, to say nothing of the
WAL bloat.
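
Roughly what I have in mind, as a toy Python sketch. All names here are
hypothetical, and it compares the new key against every existing key,
whereas a real btree insert would only log the handful of comparisons
made during the descent.

```python
def cmp_by(key):
    """Build a three-way comparator (-1, 0, 1) from a sort-key function."""
    return lambda a, b: (key(a) > key(b)) - (key(a) < key(b))


def record_comparisons(new_key, existing_keys, compare):
    """Primary side: log each comparison made while placing new_key."""
    return [(new_key, other, compare(new_key, other)) for other in existing_keys]


def verify_comparisons(logged, compare):
    """Standby side: return logged comparisons the local collation disagrees with."""
    return [(a, b, expected) for a, b, expected in logged if compare(a, b) != expected]


# Stand-ins for two collation versions: code point order vs. case-insensitive.
primary_compare = cmp_by(lambda s: s)
standby_compare = cmp_by(str.casefold)

logged = record_comparisons("b", ["a", "B", "c"], primary_compare)
mismatches = verify_comparisons(logged, standby_compare)
```

Any non-empty list of mismatches on the standby means the index was
built under an ordering the standby does not share.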

Peter G, could you go into more detail about collation versioning? What
would the implications be for Postgres?

--
greg
