From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: error-handling in ReorderBufferCommit() seems somewhat broken |
Date: | 2017-08-30 00:11:10 |
Message-ID: | 171de158-ab30-edce-0d0e-97022d217196@2ndquadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 08/30/2017 01:34 AM, Andres Freund wrote:
> Hi,
>
> On 2017-08-30 01:27:34 +0200, Tomas Vondra wrote:
>> I've been investigating some failures in test_decoding regression tests,
>> and it seems to me the error-handling in ReorderBufferCommit() is
>> somewhat broken, leading to segfault crashes.
>>
>> The problematic part is this:
>>
>> PG_CATCH()
>> {
>> /*
>> * Force cache invalidation to happen outside of a valid transaction
>> * to prevent catalog access as we just caught an error.
>> */
>> AbortCurrentTransaction();
>>
>> /* make sure there's no cache pollution */
>> ReorderBufferExecuteInvalidations(rb, txn);
>>
>> ...
>>
>> }
>>
>> Triggering it trivial - just add elog(ERROR,...) at the beginning of the
>> PG_TRY() block.
>
> That's not really a valid thing to do, you should put it after the
> BeginInternalSubTransaction()/StartTransactionCommand(). It's basically
> assumed that those won't fail - arguably they should be outside of a
> PG_TRY then, but that's a different matter. If you start decoding
> outside of SQL failing before those will lead to rolling back the parent
> tx...
>
>
>> I suppose this is not quite intentional, but rather an example that
>> error-handling code is an order of magnitude more complicated to write
>> and test. I've only noticed as I'm investigating some regression
>> failures on Postgres-XL 10, which does not support subtransactions and
>> so the BeginInternalSubTransaction() call in the try branch always
>> fails, triggering the issue.
>
> So, IIUC, there's no live problem in postgres core, besides some ugly &
> undocumented assumptions?
>
I'm not really following your reasoning. You may very well be right that
the BeginInternalSubTransaction() example is not quite valid on postgres
core, but I don't see how that implies there can be no other errors in
the PG_TRY block. It was merely an explanation how I noticed this issue.
To make it absolutely clear, I claim that the PG_CATCH() block is
demonstrably broken as it calls AbortCurrentTransaction() and then
accesses already freed memory.
The switch in the PG_TRY() blocks includes multiple elog(ERROR) calls in
the switch, and AFAICS hitting any of them will have exactly the same
effect as failure in BeginInternalSubTransaction(). And I suppose there
many other ways to trigger an error and get into the catch block. In
other words, why would we have the block at all?
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Langote | 2017-08-30 00:13:16 | Re: Tuple-routing for certain partitioned tables not working as expected |
Previous Message | Thomas Munro | 2017-08-29 23:34:28 | Re: A design for amcheck heapam verification |