From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Arseny Sher <a(dot)sher(at)postgrespro(dot)ru>
Cc: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Global snapshots
Date: 2018-09-24 08:58:48
Message-ID: 19F4097D-0525-4C4E-A40F-2C8C2B0787CF@yandex-team.ru
Lists: pgsql-hackers
Hi!
I want to review this patch set, though I understand that it will probably be quite a long process.
I like the idea that with this patch set all Postgres instances are universally bound into a single distributed DB, even if they have never heard about each other before :) This is just amazing. Or did I get something wrong?
I've got a few questions:
1. If we coordinate HA clusters with replicas, can the replicas participate if their part of the transaction is read-only?
2. How does an InDoubt transaction behave when we add or subtract leap seconds?
Also, I could not understand some notes from Arseny:
> On 25 Jul 2018, at 16:35, Arseny Sher <a(dot)sher(at)postgrespro(dot)ru> wrote:
>
> * One drawback of these patches is that only REPEATABLE READ is
> supported. For READ COMMITTED, we must export every new snapshot
> generated on coordinator to all nodes, which is fairly easy to
> do. SERIALIZABLE will definitely require chattering between nodes,
> but that's much less demanded isolevel (e.g. we still don't support
> it on replicas).
If all shards execute transactions at SERIALIZABLE, what anomalies does that permit?
Say transactions 1 and 2 both span servers A and B: on server A transaction 1 is serialized before transaction 2, but on server B it is serialized after it, right?
Maybe we can somehow abort 1 or 2?
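To make the question concrete, here is a toy sketch (Python, unrelated to the patch code; the account names and the 100-unit transfer are made up) of the kind of result I worry about when the two per-node serial orders disagree:

# Two "nodes", each holding one account.  Each node on its own behaves
# serializably, but the per-node serial orders of T1 and T2 disagree.
accounts_a = {"alice": 100}   # lives on server A
accounts_b = {"bob": 0}       # lives on server B

# T1: move 100 from alice (on A) to bob (on B).
accounts_a["alice"] -= 100    # T1's write commits on A ...

# T2: read both accounts.  On A it runs after T1, on B it runs before T1.
seen = accounts_a["alice"] + accounts_b["bob"]   # sees 0 + 0

accounts_b["bob"] += 100      # ... and only now commits on B

# Locally: A's order is T1 < T2, B's order is T2 < T1.  There is no global
# serial order in which T2's observed total of 0 is correct -- it should be 100.
print(seen)   # 0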
>
> * Another somewhat serious issue is that there is a risk of recency
> guarantee violation. If client starts transaction at node with
> lagging clocks, its snapshot might not include some recently
> committed transactions; if client works with different nodes, she
> might not even see her own changes. CockroachDB describes at [1] how
> they and Google Spanner overcome this problem. In short, both set
> hard limit on maximum allowed clock skew. Spanner uses atomic
> clocks, so this skew is small and they just wait it at the end of
> each transaction before acknowledging the client. In CockroachDB, if
> tuple is not visible but we are unsure whether it is truly invisible
> or it's just the skew (the difference between snapshot and tuple's
> csn is less than the skew), transaction is restarted with advanced
> snapshot. This process is not infinite because the upper border
> (initial snapshot + max skew) stays the same; this is correct as we
> just want to ensure that our xact sees all the committed ones before
> it started. We can implement the same thing.
I think this situation is also covered by Clock-SI, since transactions will not leave the InDoubt state before we can see them. But I'm not sure; chances are I've got something wrong, and I'll think more about it. I'd be happy to hear comments from Stas on this. My reading of the quoted restart rule is sketched below.
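A rough sketch (Python, not the patch's code; MAX_SKEW and the csn plumbing are assumptions of mine) of how I understand the restart-with-advanced-snapshot rule:

MAX_SKEW = 0.25   # assumed hard limit on clock skew between nodes, in seconds

def visible_snapshot(initial_snapshot_csn, tuple_csns):
    # Restart with an advanced snapshot whenever an invisible tuple's csn is
    # within the skew window; the upper bound never moves, so this terminates.
    snapshot = initial_snapshot_csn
    upper_bound = initial_snapshot_csn + MAX_SKEW
    restarted = True
    while restarted:
        restarted = False
        for csn in tuple_csns:
            if snapshot < csn <= upper_bound:
                # The tuple may actually have committed before our xact
                # started and only looks newer because of clock skew, so we
                # cannot simply treat it as invisible.
                snapshot = csn
                restarted = True
                break
    return snapshot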
>
>
> * 003_bank_shared.pl test is removed. In current shape (loading one
> node) it is useless, and if we bombard both nodes, deadlock surely
> appears. In general, global snaphots are not needed for such
> multimaster-like setup -- either there are no conflicts and we are
> fine, or there is a conflict, in which case we get a deadlock.
Can we do something with this deadlock? Would placing an upper limit on the time spent in the InDoubt state fix the issue? I understand that aborting automatically is kind of dangerous...
Also, a hanging 2PC transaction can currently cause a lot of headache for a DBA. Can we have some kind of protection for the case when one node is gone permanently during a transaction?
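To illustrate the kind of external protection I mean (just a strawman, not something to bake into the patch; the DSN, the one-hour cutoff, and the choice to roll back rather than commit are all my assumptions), a periodic job like this could resolve prepared transactions stuck after a node is lost:

import psycopg2

STALE_AFTER = "1 hour"   # assumed cutoff; a real deployment would tune this

def rollback_stale_prepared(dsn):
    # Find prepared (in-doubt) transactions older than the cutoff and roll
    # them back.  This is exactly the "dangerous" automatic abort: the
    # coordinator might still intend to commit them, so it could only be a
    # last resort for a node that is known to be gone for good.
    with psycopg2.connect(dsn) as conn:
        conn.autocommit = True
        with conn.cursor() as cur:
            cur.execute(
                "SELECT gid FROM pg_prepared_xacts "
                "WHERE prepared < now() - interval %s",
                (STALE_AFTER,),
            )
            for (gid,) in cur.fetchall():
                cur.execute("ROLLBACK PREPARED %s", (gid,))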
Thanks!
Best regards, Andrey Borodin.