From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, "'amitlangote09(at)gmail(dot)com'" <amitlangote09(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Test of a partition with an incomplete detach has a timing issue
Date: 2021-05-25 01:12:40
Message-ID: 1176115.1621905160@sss.pgh.pa.us
Lists: pgsql-hackers

Michael Paquier <michael(at)paquier(dot)xyz> writes:
> On Mon, May 24, 2021 at 02:07:12PM -0400, Alvaro Herrera wrote:
>> Maybe we can change the "cancel" query to something like
>> SELECT pg_cancel_backend(pid), somehow_wait_for_detach_to_terminate() FROM d3_pid;
>> ... where maybe that function can check the "state" column in s3's
>> pg_stat_activity row? I'll give that a try.

> Couldn't you achieve that with a small PL function in a way similar to
> what 32a9c0b did, except that you track a different state in
> pg_stat_activity for this PID?

The number of subsequent fixes to 32a9c0b seems to argue against
using that as a model :-(

The experiments I did a while ago are coming back to me now. I tried
a number of variations on this same theme, and none of them closed
the gap entirely. The fundamental problem is that it's possible
for backend A to complete its transaction, and for backend B (which
is the isolationtester's monitoring session) to observe that A has
completed its transaction, and for B to report that fact to the
isolationtester, and for that report to arrive at the isolationtester
*before A's query result does*. You need some bad luck for that
to happen, like A losing the CPU right before it flushes its output
buffer to the client, but I was able to demonstrate it fairly
repeatably. (IIRC, the reason I was looking into this was that
the clobber-cache-always buildfarm critters were showing such
failures somewhat regularly.)

It doesn't really matter whether B's observation technique involves
locks (as now), or the pgstat activity table, or what. Conceivably,
if we used the activity data, we could have A postpone updating its
state to "idle" until after it's flushed its buffer to the client.
But that would likely break things for other use cases. Moreover,
it still guarantees nothing, really, because we're still at the
mercy of the kernel as to when it will choose to deliver network
packets.
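
For concreteness, the sort of polling helper we're talking about
would look something like this (just a sketch; the function name
and polling interval are made up):

    create function wait_for_idle(target_pid int) returns void
    language plpgsql as $$
    declare
      st text;
    begin
      loop
        select state into st from pg_stat_activity
          where pid = target_pid;
        -- a vanished backend counts the same as an idle one
        exit when st is null or st = 'idle';
        perform pg_sleep(0.01);
      end loop;
    end;
    $$;

The cancel step would call that right after pg_cancel_backend(), per
Alvaro's sketch above. But per the above, this can see A's entry go
"idle" while A's results are still sitting in an unflushed output
buffer.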

So a completely bulletproof interlock seems out of reach.
Maybe something like what Alvaro's thinking of will get the
failure rate down to an acceptable level for most developers.
A simple "pg_sleep(1)" might have about the same effect for
much less work, though.
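
(In spec-file terms, that'd be something on the order of

    step "cancel" { SELECT pg_cancel_backend(pid) FROM d3_pid;
                    SELECT pg_sleep(1); }

modulo whatever the step is actually named in the test.)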

I do agree we need to do something. I've had two or three failures
in those test cases in the past few days in my own manual check-world
runs, which is orders of magnitude worse than the previous
reliability.

regards, tom lane
