Re: Too many autovacuum workers spawned during forced auto-vacuum

From: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Too many autovacuum workers spawned during forced auto-vacuum
Date: 2017-01-20 06:40:44
Message-ID: CAJ3gD9dwH3AOFgf8J4UGg6jxaXUJV-mRB+QmKSuX0bQYJGAo9A@mail.gmail.com
Lists: pgsql-hackers

On 18 January 2017 at 02:32, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Jan 13, 2017 at 8:45 AM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
>> I think this is the same problem as reported in
>> https://www.postgresql.org/message-id/CAMkU=1yE4YyCC00W_GcNoOZ4X2qxF7x5DUAR_kMt-Ta=YPyFPQ@mail.gmail.com
>
> If I understand correctly, and it's possible that I don't, the issues
> are distinct. I think that the issue in that thread has to do with
> the autovacuum launcher starting workers over and over again in a
> tight loop, whereas this issue seems to be about autovacuum workers
> restarting the launcher over and over again in a tight loop. In that
> thread, it's the autovacuum launcher that is looping, which can only
> happen when autovacuum=on. In this thread, the autovacuum launcher is
> repeatedly exiting and getting restarted, which can only happen when
> autovacuum=off.
Yes, that's true: in the other thread, autovacuum is on. However, I
haven't been able to work out why there would be a storm of workers
spawned when autovacuum is on. When it is on, the launcher starts a
worker only when it's time to start one.

>
> I would be tempted to install something directly in postmaster.c. If
> CheckPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER) && Shutdown ==
> NoShutdown but we last set start_autovac_launcher = true less than 10
> seconds ago, don't do it again.

My impression was that the postmaster is supposed to do only the
minimal work of starting the auto-vacuum launcher if it is not already
running, and that ensuring everything keeps going is the job of the
auto-vacuum launcher.
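
For reference, the throttle being proposed could look roughly like the
sketch below. This is only an illustration of the "not again within 10
seconds" rule, not actual postmaster.c code: the names
AV_LAUNCHER_MIN_RESTART_SECS, launcher_start_allowed and the two static
variables are made up for the example, and the real check would sit next
to the existing PMSIGNAL_START_AUTOVAC_LAUNCHER handling.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical constant: minimum gap between launcher starts ("10 seconds"). */
#define AV_LAUNCHER_MIN_RESTART_SECS 10

/* Hypothetical state: whether and when a start request was last honoured. */
static bool   launcher_requested_before = false;
static time_t last_launcher_request;

/*
 * Decide whether a launcher start request arriving at time "now" should be
 * honoured. A request is refused if the previous honoured request was less
 * than AV_LAUNCHER_MIN_RESTART_SECS seconds ago; otherwise it is accepted
 * and the timestamp is updated.
 */
bool
launcher_start_allowed(time_t now)
{
    if (launcher_requested_before &&
        difftime(now, last_launcher_request) < AV_LAUNCHER_MIN_RESTART_SECS)
        return false;           /* too soon, ignore this signal */

    launcher_requested_before = true;
    last_launcher_request = now;
    return true;
}
```

With a 10-second gap, at most six requests per minute get through, which
matches the limit described in the quoted text that follows.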

> That limits us to launching the
> autovacuum launcher at most six times a minute when autovacuum = off.
> You could argue that defeats the point of the SendPostmasterSignal in
> SetTransactionIdLimit, but I don't think so. If vacuuming the oldest
> database took less than 10 seconds, then we won't vacuum the
> next-oldest database until we hit the next 64kB transaction ID
> boundary, but that can only cause a problem if we've got so many
> databases that we don't get to them all before we run out of
> transaction IDs, which is almost unthinkable. If you had a ten
> million tiny databases that all crossed the threshold at the same
> instant, it would take you 640 million transaction IDs to visit them
> all. If you also had autovacuum_freeze_max_age set very close to the
> upper limit for that variable, you could conceivably have the system
> shut down before all of those databases were reached. But that's a
> pretty artificial scenario. If someone has that scenario, perhaps
> they should consider more sensible configuration choices.

Yeah this logic makes sense ...

But from looking at the code, it seems that care was taken to ensure
that, when auto-vacuum is off, all databases are cleaned up as fast as
possible, with multiple workers cleaning up multiple tables in
parallel.

Instead of the autovacuum launcher and workers together making sure
that the cycle of iterations keeps running, I was thinking the
auto-vacuum launcher itself should make sure it does not spawn another
worker on the same database if the previous worker did nothing. But
that seemed pretty invasive.
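
To make the idea concrete, here is a very rough sketch of the kind of
bookkeeping it would need. Everything in it is hypothetical: DbVacState,
its fields and pick_next_database do not exist in the autovacuum code,
and the real launcher would have to get the "did nothing" result back
from the worker somehow, which is the invasive part.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-database state kept by the launcher. */
typedef struct DbVacState
{
    unsigned int dboid;                   /* database OID */
    bool         needs_forced_vacuum;     /* past the freeze threshold */
    bool         last_worker_did_nothing; /* previous worker found no work */
} DbVacState;

/*
 * Return the index of the first database that still needs a forced vacuum
 * and whose previous worker actually did something (or that has not been
 * visited yet). Returns -1 if there is no such database, in which case the
 * launcher would not spawn a worker at all.
 */
int
pick_next_database(const DbVacState *dbs, size_t ndbs)
{
    size_t i;

    for (i = 0; i < ndbs; i++)
    {
        if (dbs[i].needs_forced_vacuum && !dbs[i].last_worker_did_nothing)
            return (int) i;
    }
    return -1;
}
```

The point is only that a database whose last worker had nothing to do
is skipped instead of being handed to yet another worker.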
