Re: BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower.

From: David Gould <daveg(at)sonic(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower.
Date: 2015-10-31 06:41:40
Message-ID: 20151030234140.282542c3@engels
Lists: pgsql-bugs

On Fri, 30 Oct 2015 12:51:43 -0400
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> > Tom Lane wrote:
> >> Good point ... shouldn't we have already checked the stats before ever
> >> deciding to try to claim the table?
>
> > The second check is there to allow for some other worker (or manual
> > vacuum) having vacuumed it after we first checked, but which had
> > finished before we check the array of current jobs.
>
> I wonder whether that check costs more than it saves.

It does indeed. It drives the stats collector wild. And of course, if there
are lots of tables and indexes, the stats temp file gets very large, so
it can take a long time (seconds) to rewrite it. This happens for each
worker for each table that is a candidate for vacuuming.
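
To put rough numbers on that, here is a back-of-envelope sketch in Python.
It assumes the worst case described above, one stats file rewrite per worker
per candidate table, with no coalescing of requests. The function name and
the example figures are hypothetical, not measurements from the client
database.

```python
def stats_rewrite_cost(workers, candidate_tables, rewrite_seconds):
    """Rough upper bound on stats-file rewrite work for one autovacuum pass.

    Assumption: every worker triggers one rewrite of the stats temp file
    for every candidate table it considers.
    """
    requests = workers * candidate_tables
    return requests, requests * rewrite_seconds


# Hypothetical inputs: 100,000 candidate tables, 1 second per rewrite.
for workers in (1, 3, 10):
    requests, seconds = stats_rewrite_cost(workers, 100_000, 1.0)
    print(f"{workers:2d} workers: {requests:,d} rewrites, "
          f"about {seconds / 3600:.1f} hours of writing")
```

Under this model the total write work grows linearly with the number of
workers, which is consistent with more workers making things slower rather
than faster.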

Since it would not be convenient to provide a copy of the client's 8TB
database, I have made a standalone reproduction. The attached files:

build_test_instance.sh - create a test instance
datagen.py - used by above to populate it with lots of tables
logbyv.awk - count auto analyze actions in postgres log
trace.sh - strace the stats collector and autovacuum workers
tracereport.sh - list top 50 calls in strace output
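
For readers without awk handy, the log counting step might look roughly like
the following in Python. This is not the attached logbyv.awk, only a sketch
of the same idea. It assumes log_autovacuum_min_duration is set so that
"automatic analyze of table" lines are logged, and that log_line_prefix
begins with a timestamp such as "2015-10-31 06:41:40"; the function name is
made up for illustration.

```python
import re
from collections import Counter

# Assumption: each log line starts with "YYYY-MM-DD HH:MM:SS".
# The capture group keeps the timestamp down to the minute.
ANALYZE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}.*automatic analyze of table"
)


def analyzes_per_minute(lines):
    """Count auto analyze log entries, grouped by minute."""
    counts = Counter()
    for line in lines:
        match = ANALYZE_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open log file to analyzes_per_minute() and printing the sorted
items gives a per-minute analyze rate that can be compared across worker
counts.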

The test process is to run the build_test_instance script to create an
instance with one database with a large number of tiny tables. During the
setup autovacuuming is off. Then make a tarball of the instance for reuse.
For each test case, untar the instance, set the number of workers and start
it. After a short time autovacuum will start workers to analyze the new
tables. Expect to see the stats collector doing lots of writing.

You may want to use tmpfs or a ramdisk for the data dir for building the
test instance. The configuration is sized for a reasonable desktop: 8 to 16GB
of memory and an SSD.

-dg

--
David Gould 510 282 0869 daveg(at)sonic(dot)net
If simplicity worked, the world would be overrun with insects.

Attachment               Content-Type                Size
build_test_instance.sh   application/x-shellscript   4.2 KB
datagen.py               text/x-python               2.8 KB
logbyv.awk               application/x-awk           1.0 KB
trace.sh                 application/x-shellscript   531 bytes
tracereport.sh           application/x-shellscript   212 bytes
