Re: Autovacuum launcher process launches worker process at high frequency

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Autovacuum launcher process launches worker process at high frequency
Date: 2016-10-05 15:11:27
Message-ID: CAMkU=1x3q6HoUii+Yc=NNJX9DkHNKcCgn_WmtRWuNLPhthLReA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Oct 5, 2016 at 7:28 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
wrote:

> Hi all,
>
> I found some strange behaviour in the autovacuum launcher process
> during XID anti-wraparound vacuum.
>
> Suppose that a database (say test_db) whose datfrozenxid age is about
> to reach autovacuum_freeze_max_age has two tables, T1 and T2.
> T1 is very large and frequently updated, so vacuuming it takes a long
> time.
> T2 is a static, already-frozen table, so vacuum can skip the whole
> table.
> Anti-wraparound vacuum has already been executed on the other databases.
>
> Once the age of datfrozenxid of test_db exceeds
> autovacuum_freeze_max_age, the autovacuum launcher launches a worker
> process to do an anti-wraparound vacuum on test_db.
> A worker process assigned to test_db begins to vacuum T1, which takes a
> long time.
> Meanwhile, another worker process is assigned to test_db, finishes
> vacuuming T2, and exits.
>
> After a while, the autovacuum launcher launches a new worker, and that
> worker is assigned to test_db again.
> But that worker exits quickly because there is no table it needs to
> vacuum (T1 is being vacuumed by another worker process).
> When a new worker process starts, it sends a SIGUSR2 signal to the
> launcher process to wake it up.
> Although the launcher process executes WaitLatch() after launching the
> new worker, it is woken up and soon launches yet another worker
> process.
>

See also this thread, which was never resolved:

https://www.postgresql.org/message-id/flat/CAMkU%3D1yE4YyCC00W_GcNoOZ4X2qxF7x5DUAR_kMt-Ta%3DYPyFPQ%40mail.gmail.com#CAMkU=1yE4YyCC00W_GcNoOZ4X2qxF7x5DUAR_kMt-Ta=YPyFPQ@mail.gmail.com

> As a result, the launcher process launches new worker processes at an
> extremely high frequency regardless of autovacuum_naptime, which
> increases CPU usage.
>
> Why does an autovacuum worker need to wake up the launcher process
> after it has started?
>
> autovacuum.c:L1604
>     /* wake up the launcher */
>     if (AutoVacuumShmem->av_launcherpid != 0)
>         kill(AutoVacuumShmem->av_launcherpid, SIGUSR2);
>

I think that is so that the launcher can launch multiple workers in
quick succession if it has fallen behind schedule. It can't launch them in
a tight loop, because its signals to the postmaster would get merged into
one signal, so it has to wait for one worker to get mostly set up before
launching the next.
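
The merging of signals can be seen outside PostgreSQL with a small
standalone sketch. This is not code from autovacuum.c: the names
count_deliveries and on_usr1 are made up for illustration, and SIGUSR1 is
used only as an example of a standard (non-realtime) signal. The function
sends the signal to its own process several times while the signal is
blocked, then unblocks it and reports how many times the handler ran.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Number of times the handler has actually run. */
static volatile sig_atomic_t deliveries = 0;

static void
on_usr1(int signo)
{
    (void) signo;
    deliveries++;
}

/*
 * Illustrative helper (hypothetical name): send SIGUSR1 to this process
 * n times while it is blocked, unblock it, and return how many times the
 * handler ran. Returns -1 if setting up the handler or mask fails.
 */
int
count_deliveries(int n)
{
    struct sigaction sa;
    sigset_t    block;
    sigset_t    old;
    int         i;

    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    /* Block SIGUSR1 so that every kill() below only marks it pending. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return -1;

    deliveries = 0;
    for (i = 0; i < n; i++)
        kill(getpid(), SIGUSR1);

    /* Restore the old mask; the pending signal is delivered here. */
    if (sigprocmask(SIG_SETMASK, &old, NULL) != 0)
        return -1;

    return (int) deliveries;
}
```

On Linux, count_deliveries(3) is expected to return 1 rather than 3,
because a standard signal that is already pending is not queued again.
POSIX leaves this implementation-defined, so other platforms may differ.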

But it doesn't make any real difference to your scenario, as the
short-lived worker will wake the launcher up a few microseconds later
anyway, when it realizes it has no work to do and so exits.
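
A toy model makes the argument concrete. It is only a sketch under stated
assumptions, not a description of the real launcher: the function name
launches_in_window and every number passed to it are hypothetical, time is
counted in microseconds, each worker is assumed to find nothing to do, and
autovacuum_naptime is ignored because the wake-ups arrive long before it
would expire.

```c
#include <assert.h>

/*
 * Hypothetical model: count how many workers the launcher starts within
 * window_us microseconds.
 *
 *   startup_us    assumed time for a worker to start up
 *   work_us       assumed time before the worker notices it has no work
 *                 and exits
 *   wake_on_start nonzero if the worker wakes the launcher on startup;
 *                 zero if the launcher is woken only by the worker's exit
 */
long
launches_in_window(long window_us, long startup_us, long work_us,
                   int wake_on_start)
{
    long        now = 0;
    long        launched = 0;

    /* A positive startup time guarantees that the loop advances. */
    assert(startup_us > 0 && work_us >= 0);

    while (now < window_us)
    {
        long        started;
        long        exited;

        launched++;             /* launcher requests a worker at "now" */
        started = now + startup_us;
        exited = started + work_us;

        /* The launcher sleeps until the next wake-up and then repeats. */
        now = wake_on_start ? started : exited;
    }
    return launched;
}
```

With an assumed startup time of 1000 microseconds and an exit 5
microseconds later, launches_in_window(1000000, 1000, 5, 1) gives 1000
launches per second and launches_in_window(1000000, 1000, 5, 0) gives 996.
In this model, dropping the wake-up on startup barely changes the launch
rate.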

Cheers,

Jeff
