Re: Global snapshots

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>, movead(dot)li(at)highgo(dot)ca, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Global snapshots
Date: 2020-07-13 11:18:07
Message-ID: CAA4eK1KAB=YbvDd3a8g2qNj2QNrQANnrpNmoaNCtEMLfaiFjrg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 10, 2020 at 8:46 AM Masahiko Sawada
<masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>
> On Wed, 8 Jul 2020 at 21:35, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> >
> > Cool. While studying, if you can try to think whether this approach is
> > different from the global coordinator based approach then it would be
> > great. Here is my initial thought: apart from other reasons, the global
> > coordinator based design can help us to do the global transaction
> > management and snapshots. It can allocate xids for each transaction
> > and then collect the list of running xacts (or CSN) from each node and
> > then prepare a global snapshot that can be used to perform any
> > transaction. OTOH, in the design proposed in this patch, we don't need any
> > coordinator to manage transactions and snapshots because each node's
> > current CSN will be sufficient for snapshot and visibility as
> > explained above.
>
> Yeah, my thought is the same as yours. Since both approaches have strong
> points and weak points I cannot say which is the better approach,
> but that 2PC patch would go well together with the design proposed in
> this patch.
>
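
To make the contrast in the quoted text concrete, here is a minimal
sketch of the two visibility rules being compared: a coordinator-built
snapshot that carries a list of running xids, and a CSN-only snapshot
where a single value is compared against the commit CSN. The names
(XidSnapshot, visible_by_xid, visible_by_csn) and the "<=" comparison
are assumptions for illustration, not code or semantics taken from
either patch.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class XidSnapshot:
    """Coordinator-style snapshot (hypothetical shape)."""
    xmax: int                 # first xid not yet assigned at snapshot time
    running: FrozenSet[int]   # xids still in progress at snapshot time


def visible_by_xid(xid: int, committed: bool, snap: XidSnapshot) -> bool:
    """Visible if the xact committed and was already finished at snapshot time."""
    if not committed:
        return False
    if xid >= snap.xmax:
        return False          # started after the snapshot was taken
    return xid not in snap.running


def visible_by_csn(commit_csn: Optional[int], snapshot_csn: int) -> bool:
    """Visible if the xact has a commit CSN no later than the snapshot CSN.

    Assumption: None means "no commit CSN assigned (yet)".
    """
    return commit_csn is not None and commit_csn <= snapshot_csn
```

The first rule needs someone to collect the running-xid list across
nodes; the second only needs each node's current CSN, which is why no
coordinator is required in that design.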

I also think with some modifications we might be able to integrate
your 2PC patch with the patches proposed here. However, if we decide
not to pursue this approach then it is uncertain whether your proposed
patch can be further enhanced for global visibility. Does it make
sense to dig into the design of this approach a bit further so that we
can be somewhat more sure that pursuing your 2PC patch would be a good
idea and that we can, in fact, enhance it later for global visibility?
AFAICS, Andrey has mentioned a couple of problems with this approach
[1]. I am also not sure of the details at this stage, but if we can
dig into those it would be really great.

> > Now, sure this assumes that there is no clock skew
> > on different nodes or somehow we take care of the same (Note that in
> > the proposed patch the CSN is a timestamp.).
>
> As far as I read the Clock-SI paper, we take care of the clock skew by
> putting some waits on the transaction start and reading tuples on the
> remote node.
>

Oh, but I am not sure if this patch is able to solve that, and if so, how?
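
For reference, the wait that the Clock-SI paper describes for a read
arriving at a remote node can be sketched as below: if the snapshot
timestamp is ahead of the local clock, the read is delayed until the
local clock catches up. This is only a sketch of that rule, not code
from the patch and not a claim about what the patch does; the names
wait_for_local_clock and FakeClock are hypothetical.

```python
import time
from typing import Callable


def wait_for_local_clock(snapshot_ts: float,
                         local_clock: Callable[[], float] = time.time,
                         sleep: Callable[[float], None] = time.sleep) -> float:
    """Delay a read until this node's clock has reached snapshot_ts.

    Returns the total time requested from sleep(), which is 0.0 when the
    local clock is already at or past the snapshot timestamp.
    """
    waited = 0.0
    while True:
        gap = snapshot_ts - local_clock()
        if gap <= 0:
            return waited
        sleep(gap)            # local clock is behind the snapshot: wait
        waited += gap


class FakeClock:
    """Deterministic clock for trying the rule without real sleeping."""

    def __init__(self, now: float) -> None:
        self.now = now

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


if __name__ == "__main__":
    node = FakeClock(100.0)   # remote node's clock lags the snapshot by 0.5s
    print(wait_for_local_clock(100.5, node.time, node.sleep))
```

The open question above is whether the patch implements a wait of this
kind at all, and where.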

> >
> > I think InDoubt status helps in checking visibility in the proposed
> > patch wherein if we find the status of the transaction as InDoubt, we
> > wait till we get some valid CSN for it as explained in my previous
> > email. So whether or not we use it for Rollback/Rollback Prepared, it is
> > required for this design.
>
> Yes, InDoubt status is required for checking visibility. My comment
> was it's not necessary from the perspective of atomic commit.
>

True, and we can probably enhance your patch for InDoubt status if required.
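
As an illustration of the InDoubt rule discussed above, here is a
minimal sketch in which a reader that finds a transaction InDoubt
blocks until the transaction is resolved, and only then compares CSNs.
The class and method names (XactEntry, mark_in_doubt, is_visible) and
the "<=" comparison are assumptions made for the example; they do not
come from the patch.

```python
import threading
from enum import Enum
from typing import Optional


class XactStatus(Enum):
    IN_PROGRESS = "in_progress"
    IN_DOUBT = "in_doubt"      # commit started, final CSN not yet known
    COMMITTED = "committed"
    ABORTED = "aborted"


class XactEntry:
    """Per-transaction state as seen by a visibility check (hypothetical)."""

    def __init__(self) -> None:
        self.status = XactStatus.IN_PROGRESS
        self.csn: Optional[int] = None
        self._cond = threading.Condition()

    def mark_in_doubt(self) -> None:
        with self._cond:
            self.status = XactStatus.IN_DOUBT

    def commit(self, csn: int) -> None:
        with self._cond:
            self.status = XactStatus.COMMITTED
            self.csn = csn
            self._cond.notify_all()   # wake readers waiting on InDoubt

    def abort(self) -> None:
        with self._cond:
            self.status = XactStatus.ABORTED
            self._cond.notify_all()

    def is_visible(self, snapshot_csn: int,
                   timeout: Optional[float] = None) -> bool:
        """Wait while the xact is InDoubt, then apply the CSN comparison."""
        with self._cond:
            resolved = self._cond.wait_for(
                lambda: self.status is not XactStatus.IN_DOUBT, timeout)
            if not resolved:
                raise TimeoutError("transaction still InDoubt")
            return (self.status is XactStatus.COMMITTED
                    and self.csn is not None
                    and self.csn <= snapshot_csn)
```

In this sketch the wait matters only for visibility; atomic commit
itself does not depend on it, which matches the point made above.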

Thanks for moving this work forward. I know the progress is a bit
slow due to various reasons but I think it is important to keep making
some progress.

[1] - https://www.postgresql.org/message-id/f23083b9-38d0-6126-eb6e-091816a78585%40postgrespro.ru

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
