| From: | Andy Fan <zhihuifan1213(at)163(dot)com> |
|---|---|
| To: | Michael Paquier <michael(at)paquier(dot)xyz> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: raise ERROR between EndPrepare and PostPrepare_Locks causes ROLLBACK 2pc PAINC |
| Date: | 2026-03-25 03:27:10 |
| Message-ID: | 874im44mi9.fsf@163.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Michael Paquier <michael(at)paquier(dot)xyz> writes:
Hi,
Thanks for showing interests for this topic!
> On Wed, Mar 25, 2026 at 08:39:07AM +0800, Andy Fan wrote:
>> I found a similar but not exactly same case at 2014 [1] which
>> might be helpful to recall a boarder understanding on this area.
>>
>> [1] https://www.postgresql.org/message-id/534AF601.1030007%40vmware.com
>
> Incorrect shared state when an ERROR happens at an arbitrary location
> is usually bad, yes.
>
> For this one, your suggestion of delaying the end of the critical
> section started at StartPrepare() and ending in EndPrepare() is not an
> acceptable solution as far as I can see, unfortunately: it would mean
> doing a SyncRepWaitForLSN() while in a critical section, and I doubt
> we'd want to do that.
I do have a more general question: (Q1). what is critical section
designed to? (Q2). What is the badness if we put more code into it
except the ERROR->PANIC logic?
Then (Q3) will SyncRepWaitForLSN be possible to raise ERROR? if no for
now and possible in future, then my Q2 raise.
> Anyway, I doubt that this one is worth caring for. The current locking
> 2PC scheme means, as far as I remember, that it is not really possible
> to interact with an external command in a specific session between
> the EndPrepare() and the PostPrepare_Locks()
> calls.
Then Q3 comes. The deeper answer might be Q2 or Q1.
> To put it in other words, let's imagine that we use a breakpoint
> between these two calls (or a wait injection point if you automate
> that). Is it possible for a second backend to mess with the state of
> the first backend waiting until its locks are transfered to the dummy
> PGPROC entry? That's what the 2014 thread is about: there was a race
> condition reachable between two sessions.
This is true, so the issue 2014 thread is more critical than the
current one and which has been fixed.
> If the answer to this question is yes, I'd agree that this is
> something that deserves a closer lookup.
Generally yes.. But I can't stop thinking the Q3 -> Q2 -> Q1 when I want
to accept this asnwer.
> And before you ask: attempting to interact with a 2PC
> state from a second session with a first session waiting between these
> two points would not work: the 2PC entry is locked, cleaned up after
> EndPrepare() and PostPrepare_Locks() at PostPrepare_Twophase().
> Trying to request an access to this entry fails, as the first backend
> is marked as locking it. A second backend attempting to lock it would
> fail, complaining that the 2PC entry with a GXID is "busy".
I can understand what you are saying now, but what does it make
difference on the above case?
> SyncRepWaitForLSN() would be a problematic pattern between the
> EndPrepare() and the PostPrepare_Locks(), but we never ERROR there on
> purpose: even if we cancel while waiting for a transaction commit we'd
> just get a WARNING, meaning that we'd be able to transfer our locks
> anyway.
again Q2 -> Q3.
> Or perhaps you have a realistic scenario where it is possible to mess
> up with the shared state, outside a elog(ERROR) forced between these
> two points?
No really. I just inject some exception on some predefined places. I even
don't know why people defined this places before. As for me, I prefer
to know MORE design points for the CRITIAL SECTION besides what I side
before.
--
Best Regards
Andy Fan
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Bharath Rupireddy | 2026-03-25 03:28:23 | Re: Reduce log level of some logical decoding messages to DEBUG1 |
| Previous Message | Bertrand Drouvot | 2026-03-25 03:25:07 | Re: relfilenode statistics |