Re: Global snapshots

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: movead(dot)li(at)highgo(dot)ca, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Global snapshots
Date: 2020-06-20 12:21:21
Message-ID: CAA4eK1K8ibXN8vknTYvjz1sd+JaDDohvV9jBxSs+=1=KJz=u=w@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 19, 2020 at 1:42 PM Andrey V. Lepikhov
<a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
>
> On 6/19/20 11:48 AM, Amit Kapila wrote:
> > On Wed, Jun 10, 2020 at 8:36 AM Andrey V. Lepikhov
> > <a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
> >> On 09.06.2020 11:41, Fujii Masao wrote:
> >>> The patches seem not to be registered in CommitFest yet.
> >>> Are you planning to do that?
> >> Not now. It is a sharding-related feature. I'm not sure that this
> >> approach is fully consistent with the sharding way now.
> > Can you please explain in detail, why you think so? There is no
> > commit message explaining what each patch does so it is difficult to
> > understand why you said so?
> For now I used this patch set for providing correct visibility in the
> case of access to the table with foreign partitions from many nodes in
> parallel. So I saw at this patch set as a sharding-related feature, but
> [1] shows another useful application.
> CSN-based approach has weak points such as:
> 1. Dependency on clocks synchronization
> 2. Needs guarantees of monotonically increasing of the CSN in the case
> of an instance restart/crash etc.
> 3. We need to delay increasing of OldestXmin because it can be needed
> for a transaction snapshot at another node.
>

So, is anyone working on improving these parts of the patch? AFAICS
from what Bruce has shared [1], some people from HighGo are working on
it, but I don't see any discussion of that yet.

> So I do not have full conviction that it will be better than a single
> distributed transaction manager.
>

When you say "single distributed transaction manager", do you mean
something like pg_dtm, which is inspired by Postgres-XL?

> Also, can you let us know if this
> > supports 2PC in some way and if so how is it different from what the
> > other thread on the same topic [1] is trying to achieve?
> Yes, the patch '0003-postgres_fdw-support-for-global-snapshots' contains
> 2PC machinery. Now I'd not judge which approach is better.
>

Yeah, I have studied both approaches a little, and I feel the main
difference is that in this patch atomicity is tightly coupled with how
we achieve global visibility. Basically, in this patch "all
running transactions are marked as InDoubt on all nodes in prepare
phase, and after that, each node commit it and stamps each xid with a
given GlobalCSN.". There are no separate APIs for
prepare/commit/rollback exposed by postgres_fdw, as there are in the
approach followed by Sawada-San's patch. It seems to me that in the
patch in this email one of the postgres_fdw nodes can be a sort of
coordinator that prepares and commits the transaction on all other
nodes, whereas that is not true in Sawada-San's patch (where the
coordinator is a local Postgres node, am I right Sawada-San?). OTOH,
Sawada-San's patch has advanced concepts like a resolver process that
can commit/abort the transactions later. I still don't have a complete
grasp of both patches, so it is difficult to say which is the better
approach, but I think at the least we should have some discussion.
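
To make the quoted flow easier to compare, here is a toy Python sketch
of "mark InDoubt in prepare, then commit and stamp with a given
GlobalCSN". It is only an illustration of that one sentence, not code
from either patch: the names Node, prepare, commit_with_csn, is_visible
and global_commit are hypothetical, the GlobalCSN is simply passed in,
and the visibility rule (a reader cannot decide on an InDoubt xid, and
sees a commit only if its CSN is below the snapshot CSN) is an
assumption.

```python
# Toy model only. All names here are hypothetical, not from the patch set.
from enum import Enum


class XidStatus(Enum):
    IN_PROGRESS = "in_progress"
    IN_DOUBT = "in_doubt"
    COMMITTED = "committed"


class Node:
    """One participating node with its own local xids."""

    def __init__(self, name):
        self.name = name
        self.status = {}  # local xid -> XidStatus
        self.csn = {}     # local xid -> GlobalCSN stamped at commit

    def begin(self, xid):
        self.status[xid] = XidStatus.IN_PROGRESS

    def prepare(self, xid):
        # Prepare phase: the transaction is marked InDoubt on this node.
        self.status[xid] = XidStatus.IN_DOUBT

    def commit_with_csn(self, xid, global_csn):
        # Commit phase: stamp the xid with the given GlobalCSN.
        if self.status.get(xid) is not XidStatus.IN_DOUBT:
            raise RuntimeError(f"xid {xid} was not prepared on {self.name}")
        self.csn[xid] = global_csn
        self.status[xid] = XidStatus.COMMITTED

    def is_visible(self, xid, snapshot_csn):
        # Assumption: an InDoubt xid cannot be judged yet, so the reader
        # has to wait; this toy model signals that with an exception.
        state = self.status.get(xid)
        if state is XidStatus.IN_DOUBT:
            raise LookupError(f"xid {xid} is InDoubt on {self.name}")
        return state is XidStatus.COMMITTED and self.csn[xid] < snapshot_csn


def global_commit(participants, global_csn):
    """participants: list of (node, local_xid) pairs."""
    # Phase 1: mark the transaction InDoubt on every node.
    for node, xid in participants:
        node.prepare(xid)
    # Phase 2: every node commits and stamps the same GlobalCSN.
    for node, xid in participants:
        node.commit_with_csn(xid, global_csn)
```

The point of the sketch is that atomicity and visibility share one
mechanism here: the same InDoubt marking that acts as the prepare step
is also what keeps readers from seeing a half-committed transaction.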

I feel that if Sawada-San or someone involved in the other patch also
studies this approach and tries to come up with some form of
comparison, then we might be able to make a better decision. It is
possible that there are a few good things in each approach that we can
use.

> Also, I
> > would like to know if the patch related to CSN based snapshot [2] is a
> > precursor for this, if not, then is it any way related to this patch
> > because I see the latest reply on that thread [2] which says it is an
> > infrastructure of sharding feature but I don't understand completely
> > whether these patches are related?
> I need some time to study this patch. At first sight it is different.
>

I feel the opposite. I think it has extracted some stuff from this
patch series and extended it.

Thanks for the inputs. I feel inputs from you and others who were
involved in this project will be really helpful in moving it forward.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
