Re: Global snapshots

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru>
Cc: "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>, tsunakawa(dot)takay(at)fujitsu(dot)com, movead(dot)li(at)highgo(dot)ca, 'Amit Kapila' <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Global snapshots
Date: 2020-09-09 17:29:23
Message-ID: b5ea3797-0bcc-7288-ba76-119a423dd693@oss.nttdata.com
Lists: pgsql-hackers

On 2020/09/09 2:00, Alexey Kondratov wrote:
> On 2020-09-08 14:48, Fujii Masao wrote:
>> On 2020/09/08 19:36, Alexey Kondratov wrote:
>>> On 2020-09-08 05:49, Fujii Masao wrote:
>>>> On 2020/09/05 3:31, Alexey Kondratov wrote:
>>>>>
>>>>> Attached is a patch that implements plain 2PC in postgres_fdw and adds a GUC 'postgres_fdw.use_twophase'. It also solves the error handling issues above and tries to add proper comments everywhere. I think that 0003 should be rebased on top of it, or it could be the first patch in the set, since it may be used independently. What do you think?
>>>>
>>>> Thanks for the patch!
>>>>
>>>> Sawada-san was proposing another 2PC patch at [1]. Do you have any thoughts
>>>> about pros and cons between your patch and Sawada-san's?
>>>>
>>>> [1]
>>>> https://www.postgresql.org/message-id/CA+fd4k4z6_B1ETEvQamwQhu4RX7XsrN5ORL7OhJ4B5B6sW-RgQ@mail.gmail.com
>>>
>>> Thank you for the link!
>>>
>>> After a quick look at Sawada-san's patch set, I think that there are two major differences:
>>
>> Thanks for sharing your thoughts! As far as I can tell from a quick read
>> of your patch, I basically agree with this view.
>>
>>
>>>
>>> 1. There is a built-in foreign xacts resolver in [1], which should be much more convenient from the end-user perspective. It involves huge in-core changes and additional complexity, which is of course worth it.
>>>
>>> However, it's still not clear to me that it is possible to resolve all foreign prepared xacts on Postgres' own side with a 100% guarantee. Imagine a situation where the coordinator node is actually an HA cluster group (primary + sync + async replica) and it fails just after the PREPARE stage or after the local COMMIT. In that case all foreign xacts will be left in the prepared state. After the failover process completes, the synchronous replica will become the new primary. Would it have all the info required to properly resolve orphaned prepared xacts?
>>
>> IIUC, yes, the information required for automatic resolution is
>> WAL-logged and the standby tries to resolve those orphan transactions
>> from WAL after the failover. But Sawada-san's patch provides
>> a special function for manual resolution, so there may be some cases
>> where manual resolution is necessary.
>>
>
> I've found a note about manual resolution in the v25 0002:
>
> +After that we prepare all foreign transactions by calling
> +PrepareForeignTransaction() API. If we failed on any of them we change to
> +rollback, therefore at this time some participants might be prepared whereas
> +some are not prepared. The former foreign transactions need to be resolved
> +using pg_resolve_foreign_xact() manually and the latter ends transaction
> +in one-phase by calling RollbackForeignTransaction() API.
>
> but it's not yet clear to me.
>
>>
>> Implementing the 2PC feature only inside postgres_fdw seems to cause
>> another issue: COMMIT PREPARED is issued to the remote servers
>> after marking the local transaction as committed
>> (i.e., ProcArrayEndTransaction()).
>>
>
> According to Sawada-san's v25 0002, the logic is pretty much the same there:
>
> +2. Pre-Commit phase (1st phase of two-phase commit)
>
> +3. Commit locally
> +Once we've prepared all of them, commit the transaction locally.
>
> +4. Post-Commit Phase (2nd phase of two-phase commit)
>
> A brief look at the code confirms this scheme. IIUC, AtEOXact_FdwXact / FdwXactParticipantEndTransaction happens after ProcArrayEndTransaction() in CommitTransaction(). Thus, I don't see much difference between this approach and CallXactCallbacks() usage regarding this point.

IIUC, the commit logic in Sawada-san's patch looks like this:

1. PreCommit_FdwXact()
    PREPARE TRANSACTION command is issued

2. RecordTransactionCommit()
    2-1. WAL-log the commit record
    2-2. Update CLOG
    2-3. Wait for sync rep
    2-4. FdwXactWaitForResolution()
        Wait until COMMIT PREPARED commands are issued to the remote servers and completed.

3. ProcArrayEndTransaction()
4. AtEOXact_FdwXact(true)

So ISTM that the timing of when COMMIT PREPARED is issued
to the remote server is different between the patches.
Am I missing something?
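
To make the ordering question concrete, here is a rough sketch in Python rather than actual backend code. The event names mirror the functions mentioned in this thread, but the lists are only a model of the call order as I understand it, and the helper names (index_of, remote_commit_precedes_local_visibility) are made up for illustration. In particular, the postgres_fdw-only sequence is an assumption: it supposes that PREPARE TRANSACTION is sent from a pre-commit callback and COMMIT PREPARED from CallXactCallbacks() at commit.

```python
# Illustrative model only: no PostgreSQL code is called here.
# Each function returns the assumed order of events during local commit.

def commit_with_fdwxact_resolver():
    """Ordering as described for Sawada-san's patch in this message."""
    return [
        "PreCommit_FdwXact: PREPARE TRANSACTION on remote servers",
        "RecordTransactionCommit: WAL-log the commit record",
        "RecordTransactionCommit: update CLOG",
        "RecordTransactionCommit: wait for sync rep",
        "FdwXactWaitForResolution: COMMIT PREPARED on remote servers",
        "ProcArrayEndTransaction",
        "AtEOXact_FdwXact(true)",
    ]


def commit_with_postgres_fdw_callback():
    """Assumed ordering when 2PC lives only inside postgres_fdw."""
    return [
        "pre-commit callback: PREPARE TRANSACTION on remote servers",
        "RecordTransactionCommit",
        "ProcArrayEndTransaction",
        "CallXactCallbacks: COMMIT PREPARED on remote servers",
    ]


def index_of(events, needle):
    """Return the position of the first event containing `needle`."""
    for position, event in enumerate(events):
        if needle in event:
            return position
    raise ValueError(f"no event contains {needle!r}")


def remote_commit_precedes_local_visibility(events):
    """True if COMMIT PREPARED comes before ProcArrayEndTransaction."""
    remote_commit = index_of(events, "COMMIT PREPARED")
    local_visible = index_of(events, "ProcArrayEndTransaction")
    return remote_commit < local_visible


if __name__ == "__main__":
    for build in (commit_with_fdwxact_resolver, commit_with_postgres_fdw_callback):
        print(build.__name__, remote_commit_precedes_local_visibility(build()))
```

Under these assumptions, the first sequence issues COMMIT PREPARED before ProcArrayEndTransaction() and the second issues it afterwards, which is the timing difference I'm asking about.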

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
