Re: BUG #15492: pg_cancel_backend(pg_backend_pid()) returns true sporadically

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Alexander Law <exclusion(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15492: pg_cancel_backend(pg_backend_pid()) returns true sporadically
Date: 2018-11-09 14:48:49
Message-ID: CABUevEzPW_OrC2nry0puBWhhz_e4+kx-Tv6=nsJGdJcGjzWUxQ@mail.gmail.com
Lists: pgsql-bugs

On Fri, Nov 9, 2018 at 2:25 AM Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
wrote:

> On Thu, Nov 8, 2018 at 11:31 PM Magnus Hagander <magnus(at)hagander(dot)net>
> wrote:
> > On Thu, Nov 8, 2018 at 5:31 AM PG Bug reporting form <noreply(at)postgresql(dot)org> wrote:
> >>
> >> The following bug has been logged on the website:
> >>
> >> Bug reference: 15492
> >> Logged by: Alexander Lakhin
> >> Email address: exclusion(at)gmail(dot)com
> >> PostgreSQL version: 11.0
> >> Operating system: Windows 2012 R2
> >> Description:
> >>
> >> When performing `make standbycheck` I get sporadic failure:
> >>
> >> ============== running regression test queries ==============
> >> test hs_standby_check ... ok
> >> test hs_standby_allowed ... ok
> >> test hs_standby_disallowed ... ok
> >> test hs_standby_functions ... FAILED
> >>
> >> ======================
> >> 1 of 4 tests failed.
> >> ======================
> >>
> >> *** C:/tmp/postgrespro-standard-10.6.1/src/test/regress/expected/hs_standby_functions.out Wed Nov 7 01:14:03 2018
> >> --- C:/tmp/postgrespro-standard-10.6.1/src/test/regress/results/hs_standby_functions.out Wed Nov 7 06:36:47 2018
> >> ***************
> >> *** 37,40 ****
> >>
> >> -- suicide is painless
> >> select pg_cancel_backend(pg_backend_pid());
> >> ! ERROR: canceling statement due to user request
> >> --- 37,44 ----
> >>
> >> -- suicide is painless
> >> select pg_cancel_backend(pg_backend_pid());
> >> ! pg_cancel_backend
> >> ! -------------------
> >> ! t
> >> ! (1 row)
> >> !
> >>
> >> ======================================================================
> >>
> >> In fact, I see the same when I just do in psql (using EnterpriseDB's
> >> PostgreSQL 11 for Windows):
> >>
> >> postgres=# select pg_cancel_backend(pg_backend_pid());
> >> ERROR: canceling statement due to user request
> >> postgres=# select pg_cancel_backend(pg_backend_pid());
> >> ERROR: canceling statement due to user request
> >> postgres=# select pg_cancel_backend(pg_backend_pid());
> >> ERROR: canceling statement due to user request
> >> postgres=# select pg_cancel_backend(pg_backend_pid());
> >> pg_cancel_backend
> >> -------------------
> >> t
> >> (1 row)
> >>
> >>
> >> postgres=# select pg_cancel_backend(pg_backend_pid());
> >> pg_cancel_backend
> >> -------------------
> >> t
> >> (1 row)
> >>
> >>
> >> postgres=# select pg_cancel_backend(pg_backend_pid());
> >> ERROR: canceling statement due to user request
> >> postgres=#
> >>
> >> I couldn't reproduce it on Linux, though.
> >> So if it's an expected behaviour, shouldn't the hs_standby_functions check
> >> be fixed?
> >> (I don't understand what is the point of this pg_cancel_backend call.)
> >
> >
> > This is clearly a timing thing.
> >
> > The most common case is that the signal is sent and delivered while the
> > pg_cancel_backend() command is still executing. This is probably "always"
> > happening on Unix due to how signals work.
> >
> > On Windows, what happens in the case where it returns is that the signal
> > is delivered to the "signal thread" (the separate thread handling our
> > signal emulation), but that thread is not scheduled to run until the
> > pg_cancel_backend() function has actually returned. Thus it returns the
> > value and is then canceled.
> >
> > That said, I agree with the question -- what is the point of this?
> > pg_cancel_backend(pg_backend_pid()) can surely only ever cancel the
> > pg_cancel_backend call itself, so it seems pointless.
> >
> > The *comment* talks about suicide, which indicates that maybe the
> > original intention was to use pg_terminate_backend()? But it has also been
> > in there since 2009, so why is this problem only showing up now?
>
> We saw a variant of this problem on appveyor (a Windows build-bot)
> when testing Daniel's patch to add an optional message (search for
> "timing"), and it was fixed as part of that patch, for the new code in
> that patch:
>
>
> https://www.postgresql.org/message-id/flat/C2C7C3EC-CC5F-44B6-9C78-637C88BD7D14(at)yesql(dot)se
>
> Perhaps other pre-existing tests need similar treatment?
>

Ah yes, that seems to be the same thing, and yes, that seems like a
reasonable solution. So something like:
+select case
+ when pg_cancel_backend(pg_backend_pid())
+ then pg_sleep(60)
+end;
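
To spell out why the sleep should help, here is a rough toy model in Python.
It is only a sketch of the timing argument above, not PostgreSQL code: the
ToyBackend class, its "ticks" clock and the delivery_delay parameter are all
made up for illustration. A delay of 0 stands for the signal arriving while
pg_cancel_backend() is still running, a delay of 1 for the Windows case where
the signal thread only runs after the function has returned.

```python
# Toy model only: none of these names exist in PostgreSQL.

class QueryCanceled(Exception):
    """Stands in for 'ERROR: canceling statement due to user request'."""


class ToyBackend:
    def __init__(self, delivery_delay):
        # Hypothetical knob: clock ticks between sending the cancel request
        # and the backend being able to notice it (0 = immediate delivery).
        self.delivery_delay = delivery_delay
        self.clock = 0
        self.cancel_due_at = None

    def check_for_interrupts(self):
        # Raise only once the pending cancel has actually been delivered.
        if self.cancel_due_at is not None and self.clock >= self.cancel_due_at:
            raise QueryCanceled("canceling statement due to user request")

    def cancel_self(self):
        # Send the request to ourselves, then report success.
        self.cancel_due_at = self.clock + self.delivery_delay
        self.check_for_interrupts()
        return True

    def sleep(self, ticks):
        # Sleeping advances the clock and checks for interrupts on each tick.
        for _ in range(ticks):
            self.clock += 1
            self.check_for_interrupts()


def plain_select(backend):
    """Models: select pg_cancel_backend(pg_backend_pid());"""
    try:
        return "t" if backend.cancel_self() else "f"
    except QueryCanceled as exc:
        return "ERROR: " + str(exc)


def guarded_select(backend):
    """Models: select case when pg_cancel_backend(...) then pg_sleep(60) end;"""
    try:
        if backend.cancel_self():
            backend.sleep(60)
        return "(null)"
    except QueryCanceled as exc:
        return "ERROR: " + str(exc)


if __name__ == "__main__":
    for delay in (0, 1):
        print(delay, plain_select(ToyBackend(delay)), "|",
              guarded_select(ToyBackend(delay)))
```

In this model the plain form gives the ERROR with delay 0 but 't' with delay
1, while the guarded form gives the ERROR in both cases. It also shows the
limit of the approach: the guarded form only works as long as the cancel
arrives before the sleep ends.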

Alexander, can you check to see if making that change solves the issue on
your machine?

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
