Re: BUG #15492: pg_cancel_backend(pg_backend_pid()) returns true sporadically

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Alexander Lakhin <exclusion(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #15492: pg_cancel_backend(pg_backend_pid()) returns true sporadically
Date: 2018-11-09 01:25:08
Message-ID: CAEepm=16ZLqpPk6LjPVORFrN0ix_zQ1VUM7Hs=iuP5ExGEE3sA@mail.gmail.com
Lists: pgsql-bugs

On Thu, Nov 8, 2018 at 11:31 PM Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Thu, Nov 8, 2018 at 5:31 AM PG Bug reporting form <noreply(at)postgresql(dot)org> wrote:
>>
>> The following bug has been logged on the website:
>>
>> Bug reference: 15492
>> Logged by: Alexander Lakhin
>> Email address: exclusion(at)gmail(dot)com
>> PostgreSQL version: 11.0
>> Operating system: Windows 2012 R2
>> Description:
>>
>> When performing `make standbycheck` I get sporadic failure:
>>
>> ============== running regression test queries ==============
>> test hs_standby_check ... ok
>> test hs_standby_allowed ... ok
>> test hs_standby_disallowed ... ok
>> test hs_standby_functions ... FAILED
>>
>> ======================
>> 1 of 4 tests failed.
>> ======================
>>
>> ***
>> C:/tmp/postgrespro-standard-10.6.1/src/test/regress/expected/hs_standby_functions.out Wed
>> Nov 7 01:14:03 2018
>> ---
>> C:/tmp/postgrespro-standard-10.6.1/src/test/regress/results/hs_standby_functions.out Wed
>> Nov 7 06:36:47 2018
>> ***************
>> *** 37,40 ****
>>
>> -- suicide is painless
>> select pg_cancel_backend(pg_backend_pid());
>> ! ERROR: canceling statement due to user request
>> --- 37,44 ----
>>
>> -- suicide is painless
>> select pg_cancel_backend(pg_backend_pid());
>> ! pg_cancel_backend
>> ! -------------------
>> ! t
>> ! (1 row)
>> !
>>
>> ======================================================================
>>
>> In fact, I see the same when I just do in psql (using EnterpriseDB's
>> PostgreSQL 11 for Windows):
>>
>> postgres=# select pg_cancel_backend(pg_backend_pid());
>> ERROR: canceling statement due to user request
>> postgres=# select pg_cancel_backend(pg_backend_pid());
>> ERROR: canceling statement due to user request
>> postgres=# select pg_cancel_backend(pg_backend_pid());
>> ERROR: canceling statement due to user request
>> postgres=# select pg_cancel_backend(pg_backend_pid());
>> pg_cancel_backend
>> -------------------
>> t
>> (1 row)
>>
>>
>> postgres=# select pg_cancel_backend(pg_backend_pid());
>> pg_cancel_backend
>> -------------------
>> t
>> (1 row)
>>
>>
>> postgres=# select pg_cancel_backend(pg_backend_pid());
>> ERROR: canceling statement due to user request
>> postgres=#
>>
>> I couldn't reproduce it on Linux, though.
>> So if it's an expected behaviour, shouldn't the hs_standby_functions check
>> be fixed?
>> (I don't understand what is the point of this pg_cancel_backend call.)
>
>
> This is clearly a timing thing.
>
> The most common case is that the signal is sent and delivered while the pg_cancel_backend() command is still executing. This is probably "always" happening on Unix due to how signals work.
>
> On Windows, what happens in the case where it returns is that the signal is delivered to the "signal thread" (the separate thread handling our signal emulation), but that thread is not scheduled to run until the pg_cancel_backend() function has actually returned. Thus it returns the value and is then canceled.
>
> That said, I agree with the question -- what is the point of this? pg_cancel_backend(pg_backend_pid()) can surely only ever cancel the pg_cancel_backend call itself, so it seems pointless.
>
> The *comment* talks about suicide, which indicates that maybe the original intention was to use pg_terminate_backend()? But it has also been in there since 2009, so why is this problem only showing up now?

We saw a variant of this problem on AppVeyor (a Windows build bot)
when testing Daniel's patch to add an optional message (search the
thread below for "timing"). It was fixed as part of that patch, but
only for the new code that the patch added:

https://www.postgresql.org/message-id/flat/C2C7C3EC-CC5F-44B6-9C78-637C88BD7D14(at)yesql(dot)se

Perhaps other pre-existing tests need similar treatment?
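
To make the timing Magnus describes above concrete, here is a toy model
of it in Python. It is only a sketch and not PostgreSQL code: the names
run_cancel_self, deliver_late, QueryCanceled and check_for_interrupts
are made up for illustration, and a threading.Event stands in for the
scheduler deciding when the emulated "signal thread" gets to run.

```python
import threading


class QueryCanceled(Exception):
    """Stands in for 'ERROR: canceling statement due to user request'."""


def run_cancel_self(deliver_late: bool) -> str:
    """Toy model of `select pg_cancel_backend(pg_backend_pid())`.

    deliver_late=False: the signal thread runs before the function returns.
    deliver_late=True:  the signal thread is not scheduled until afterwards.
    All names here are hypothetical; this is not PostgreSQL's implementation.
    """
    cancel_pending = threading.Event()  # set by the emulated signal thread
    may_deliver = threading.Event()     # models "the thread got scheduled"
    delivered = threading.Event()       # lets the caller wait for delivery

    def signal_thread() -> None:
        may_deliver.wait()
        cancel_pending.set()
        delivered.set()

    worker = threading.Thread(target=signal_thread)
    worker.start()

    def check_for_interrupts() -> None:
        if cancel_pending.is_set():
            raise QueryCanceled("canceling statement due to user request")

    try:
        if not deliver_late:
            # Signal is delivered while the statement is still executing.
            may_deliver.set()
            delivered.wait()
        check_for_interrupts()
        result = "t"  # the function returned its value before the cancel
    except QueryCanceled as exc:
        result = f"ERROR: {exc}"
    finally:
        # Always let the signal thread finish so the model cleans up.
        may_deliver.set()
        worker.join()
    return result


print(run_cancel_self(deliver_late=False))
print(run_cancel_self(deliver_late=True))
```

In this model the first call reports the cancel error and the second
returns "t", which corresponds to the two outputs in the bug report. A
test that expects exactly one of them depends on the scheduling.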

--
Thomas Munro
http://www.enterprisedb.com
