Re: Add client connection check during the execution of the query

From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Maksim Milyutin <milyutinma(at)gmail(dot)com>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zhihong Yu <zyu(at)yugabyte(dot)com>
Subject: Re: Add client connection check during the execution of the query
Date: 2022-01-14 03:35:17
Message-ID: 20220114033517.mps5pdfkxeqeiciw@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-01-11 22:59:13 +1300, Thomas Munro wrote:
> I considered another idea we discussed: if we see a latch event, clear
> it and try again so that other events can be revealed (rince and
> repeat), but remember if that happens and set the latch at the end. I
> think that still requires PG_FINALLY() if you want to guarantee not to
> eat a latch event if WaitEventSetWait() throws. This may be a
> theoretical point because things must be pretty broken if
> WaitEventSetWait() is throwing, but I don't like an egregious lack of
> exception safety on principle.

I don't think this is a problem. Not because of WaitEventSetWait() never
throwing, but because it's "just fine" to have reset the latch in that case.

The error will cause control flow to transfer to the next PG_CATCH site. The
point of latches is to avoid racy checks for events (or sleeps with short
timeouts to handle the races). The documented pattern is:

* for (;;)
* {
* ResetLatch();
* if (work to do)
* Do Stuff();
* WaitLatch();
* }

Latches only work if there's a very limited amount of things happening between
the if (work_to_do) and the WaitLatch().

It definitely is *NOT* be ok to do something like:

for (;;)
{
ResetLatch()
if (work_to_do)
DoStuff();

PG_TRY():
something_that_may_throw();
PG_CATCH():
something_not_throwing();

WaitLatch();
}

For one, elog.c related code might have actually done network IO! During which
the latch very well might be reset. So there's just no way any remotely
reasonable code can rely on preserving latch state across errors.

> I considered another idea we discussed: if we see a latch event, clear
> it and try again so that other events can be revealed (rince and
> repeat), but remember if that happens and set the latch at the end.

The more I think about it, the less I see why we *ever* need to re-arm the
latch in pq_check_connection() in this approach. pq_check_connection() is only
used from from ProcessInterrupts(), and there's plenty things inside
ProcessInterrupts() that can cause latches to be reset (e.g. parallel message
processing causing log messages to be sent to the client, causing network IO,
which obviously can do a latch reset).

It makes sense to use CFI() in a latch loop, but it would have to be at the
bottom or top, not between if (work_to_do) and WaitLatch().

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2022-01-14 04:09:40 Re: null iv parameter passed to combo_init()
Previous Message Bharath Rupireddy 2022-01-14 03:25:34 Consistently use the function name CreateCheckPoint instead of CreateCheckpoint in code comments