Re: BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower.

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Gould <daveg(at)sonic(dot)net>, Pg Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower.
Date: 2015-10-31 04:49:00
Message-ID: CAMkU=1zQUAV6Zv3O7R5BO8AfJO+LAw7satHYfd+V2t5MO3Bp4w@mail.gmail.com
Lists: pgsql-bugs

On Fri, Oct 30, 2015 at 8:40 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>> David Gould wrote:
>>> Anyway, they are not actually vacuuming. They are waiting on the
>>> VacuumScheduleLock. And requesting fresh snapshots from the
>>> stats_collector.
>
>> Oh, I see. Interesting. Proposals welcome. I especially dislike the
>> ("very_expensive") pgstat check.
>
> Couldn't we simply move that out of the locked stanza? That is, if no
> other worker is working on the table, claim it, and release the lock
> immediately. Then do the "very expensive" check. If that fails, we
> have to re-take the lock to un-claim the table, but that sounds OK.

The attached patch does that. On a system with 4 CPUs and 100,000
tables, a big chunk of them in need of vacuuming, and with 30 worker
processes, it increased throughput by a factor of 40. Presumably it
will do even better with more CPUs.

It is still horribly inefficient, but 40 times less so.
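
For readers who want the shape of the change without opening the patch,
here is a rough model of the reordering Tom described. It is a sketch
only, not the patch: the names TableClaims, process_table, needs_vacuum
and vacuum are made up for illustration, and a plain mutex stands in
for the schedule lock.

```python
import threading


class TableClaims:
    """Toy model of the shared "which table is being worked on" state."""

    def __init__(self):
        # Stand-in for the schedule lock; held only for set operations.
        self._lock = threading.Lock()
        self._claimed = set()

    def try_claim(self, table):
        """Claim the table if no other worker has it. Cheap, under lock."""
        with self._lock:
            if table in self._claimed:
                return False
            self._claimed.add(table)
            return True

    def unclaim(self, table):
        """Re-take the lock briefly to give the table back."""
        with self._lock:
            self._claimed.discard(table)


def process_table(claims, table, needs_vacuum, vacuum):
    """Claim under the lock, run the expensive recheck outside it."""
    if not claims.try_claim(table):
        return "skipped"          # another worker already has it
    try:
        # Hypothetical expensive recheck; the lock is NOT held here,
        # so other workers can claim other tables in the meantime.
        if not needs_vacuum(table):
            return "not_needed"
        vacuum(table)
        return "vacuumed"
    finally:
        claims.unclaim(table)     # covers both the failed-recheck and done cases
```

The point of the ordering is that the lock now covers only a set lookup
and insert, so workers no longer queue behind each other's recheck.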

Cheers,

Jeff

Attachment            Content-Type   Size
vac_move_lock.patch   text/x-patch   1.9 KB
