pg_terminate_backend can terminate background workers and autovacuum launchers

From: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: pg_terminate_backend can terminate background workers and autovacuum launchers
Date: 2017-06-21 11:56:57
Message-ID: 20170621205657.61d90605.nagata@sraoss.co.jp
Lists: pgsql-hackers

Hi,

I have found that we can cancel/terminate the autovacuum launcher and
background worker processes with the pg_cancel_backend/pg_terminate_backend
functions. I'm wondering whether this behavior is expected, and if it is not,
I want to fix it.

The current pg_stat_activity shows background workers and the autovacuum
launcher as below, which is what prompted the question.

postgres=# select pid, wait_event, backend_type from pg_stat_activity ;
  pid  |     wait_event      |    backend_type
-------+---------------------+---------------------
 30902 | LogicalLauncherMain | background worker
 30900 | AutoVacuumMain      | autovacuum launcher
 30923 |                     | client backend
 30898 | BgWriterMain        | background writer
 30897 | CheckpointerMain    | checkpointer
 30899 | WalWriterMain       | walwriter
(6 rows)

We cannot use pg_terminate_backend/pg_cancel_backend on most processes
other than client backends. For example, when I tried to terminate
the walwriter (PID 30899), I got a warning and the call failed.

postgres=# select pg_terminate_backend(30899);
WARNING:  PID 30899 is not a PostgreSQL server process
 pg_terminate_backend
----------------------
 f
(1 row)

However, we can terminate background workers by pg_terminate_backend.
In the following example, I terminated the logical replication launcher,
and this process did not appear again[1].

postgres=# select pg_terminate_backend(30902);
 pg_terminate_backend
----------------------
 t
(1 row)

postgres=# select pid, wait_event, backend_type from pg_stat_activity ;
  pid  |    wait_event     |    backend_type
-------+-------------------+---------------------
 30900 | AutoVacuumMain    | autovacuum launcher
 30923 |                   | client backend
 30898 | BgWriterHibernate | background writer
 30897 | CheckpointerMain  | checkpointer
 30899 | WalWriterMain     | walwriter
(5 rows)

Similarly, we can terminate the autovacuum launcher with pg_terminate_backend,
but in this case the postmaster starts a new launcher process.[2]

postgres=# select pg_terminate_backend(30900);
 pg_terminate_backend
----------------------
 t
(1 row)

postgres=# select pid, wait_event, backend_type from pg_stat_activity ;
  pid  |    wait_event     |    backend_type
-------+-------------------+---------------------
 32483 | AutoVacuumMain    | autovacuum launcher
 30923 |                   | client backend
 30898 | BgWriterHibernate | background writer
 30897 | CheckpointerMain  | checkpointer
 30899 | WalWriterMain     | walwriter
(5 rows)

My question is whether the behavior of pg_terminate/cancel_backend is
expected. If these functions should succeed only for client backends,
we need to fix the behavior. Attached is a patch to fix it in that case.

In my patch, the process type is checked in pg_signal_backend(); if the target
is a background worker or the autovacuum launcher, a warning is raised and the
call fails. BackendPidGetProc() returns a valid PGPROC for processes that are
initialized by InitPostgres(), and, as far as I understand, all such processes
are client backends, background workers, or the autovacuum launcher. So, if the
target is neither a background worker nor the autovacuum launcher, it should be
a normal client backend. For this check, I added a new field,
isAutoVacuumLauncher, to PGPROC.
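
To make the proposed check concrete, here is a standalone sketch of its logic.
It is not the attached patch and does not use PostgreSQL headers: SketchProc,
SketchResult, sketch_pid_get_proc() and sketch_signal_backend() are
hypothetical names, and the isBackgroundWorker flag is assumed to be available
alongside the proposed isAutoVacuumLauncher field. The lookup stands in for
BackendPidGetProc(), which finds no entry for processes such as the
checkpointer or walwriter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for PGPROC; only the fields this check needs. */
typedef struct SketchProc
{
    int  pid;
    bool isBackgroundWorker;    /* assumed to be set for background workers */
    bool isAutoVacuumLauncher;  /* the new field proposed in this mail */
} SketchProc;

/* Hypothetical result codes; the real function uses its own constants. */
typedef enum SketchResult
{
    SKETCH_SIGNAL_OK,       /* client backend: the signal would be sent */
    SKETCH_NOT_A_BACKEND,   /* no entry found for this PID */
    SKETCH_NOT_CLIENT       /* background worker or autovacuum launcher */
} SketchResult;

/* Stand-in for BackendPidGetProc(): linear search, NULL when not found. */
const SketchProc *
sketch_pid_get_proc(const SketchProc *procs, size_t nprocs, int pid)
{
    for (size_t i = 0; i < nprocs; i++)
    {
        if (procs[i].pid == pid)
            return &procs[i];
    }
    return NULL;
}

/* Decide whether a cancel/terminate request for pid should go ahead. */
SketchResult
sketch_signal_backend(const SketchProc *procs, size_t nprocs, int pid)
{
    const SketchProc *proc;

    assert(procs != NULL || nprocs == 0);

    proc = sketch_pid_get_proc(procs, nprocs, pid);
    if (proc == NULL)
        return SKETCH_NOT_A_BACKEND;    /* existing warning path */

    /* Proposed extra check: refuse anything that is not a client backend. */
    if (proc->isBackgroundWorker || proc->isAutoVacuumLauncher)
        return SKETCH_NOT_CLIENT;

    return SKETCH_SIGNAL_OK;
}
```

With the PIDs from the listings above, 30923 (client backend) would be
accepted, 30902 and 30900 would be rejected by the new check, and 30899 would
still fail because no entry is found for it.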

Any comments would be appreciated.

-----
[1]
AFAIK, we have to restart the server to enable logical replication after this.
I'm not sure this is expected, but I found the following comment in
ProcessInterrupts(). Does "can be stopped at any time" mean that we can
drop this process completely?

    else if (IsLogicalLauncher())
    {
        ereport(DEBUG1,
                (errmsg("logical replication launcher shutting down")));

        /* The logical replication launcher can be stopped at any time. */
        proc_exit(0);
    }

When the logical replication launcher receives SIGTERM, it exits with exit
status 0, so it is not restarted by the postmaster.
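
The two restart outcomes reported in this mail can be summarized as a toy
model. This only encodes the observations above and is not postmaster code:
SketchProcKind and sketch_restarted_after_exit() are hypothetical names, and
the behavior for a nonzero exit status of a background worker is an assumption
that was not tested here.

```c
#include <stdbool.h>

/* The two kinds of process that pg_terminate_backend accepted above. */
typedef enum SketchProcKind
{
    SKETCH_KIND_BGWORKER,           /* e.g. the logical replication launcher */
    SKETCH_KIND_AUTOVAC_LAUNCHER
} SketchProcKind;

/* Toy model: does a replacement process appear after the process exits? */
bool
sketch_restarted_after_exit(SketchProcKind kind, int exit_status)
{
    /* Observed: a new autovacuum launcher showed up after termination. */
    if (kind == SKETCH_KIND_AUTOVAC_LAUNCHER)
        return true;

    /*
     * Observed: the logical replication launcher exited with status 0 and
     * did not come back. Treating nonzero as "restarted" is an assumption.
     */
    return exit_status != 0;
}
```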

[2]
On the other hand, when we use pg_cancel_backend on the autovacuum launcher,
it causes the following error. I'll report the details in another thread.

ERROR: can't attach the same segment more than once

-----

Regards,

--
Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>

Attachment                  Content-Type              Size
pg_terminate_backend.pach   application/octet-stream  2.3 KB
