Re: We probably need autovacuum_max_wraparound_workers

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: We probably need autovacuum_max_wraparound_workers
Date: 2012-06-28 05:45:02
Message-ID: CA+TgmoYsEGfT9=41Aa8K74z=7s31V0QZUVcq1QYmJD=6LwNK_A@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 28, 2012 at 12:51 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> For example, suppose that 26 tables each of which is 4GB in size are
>> going to simultaneously come due for an anti-wraparound vacuum in 26
>> hours.  For the sake of simplicity suppose that each will take 1 hour
>> to vacuum.  What we currently do is wait for 26 hours and then start
>> vacuuming them all at top speed, thrashing the I/O system.
>
> This is a nice description of a problem that has nothing to do with
> reality.  In the first place, we don't vacuum them all at once; we can
> only vacuum max_workers of them at a time.  In the second place, the
> cost-delay features ought to be keeping autovacuum from thrashing the
> I/O, entirely independently of what the reason was for starting the
> vacuums.

I don't think it works that way. The point is that the workload
imposed by autovac is intermittent and spiky. If you configure the
cost limit too low, the delay too high, or the number of autovac
workers too low, then autovac can't keep up, which causes all of
your tables to bloat and is a total disaster. You have to make sure
that isn't going to happen, so you naturally configure the settings
aggressively enough that you're sure autovac will be able to stay
ahead of your bloat problem. But then autovac is more
resource-intensive ALL the time, not just when there's a real need for
it. This is like giving a kid a $20 bill to buy lunch and having them
walk around until they find a restaurant sufficiently expensive that
lunch there costs $20. The point of handing over $20 was that you
were willing to spend that much *if needed*, not that the money was
burning a hole in your pocket.
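
To put rough numbers on that tradeoff, here is a back-of-the-envelope
sketch in Python.  It assumes the stock cost-based-delay defaults
(vacuum_cost_limit = 200, autovacuum_vacuum_cost_delay = 20ms,
vacuum_cost_page_miss = 10); the point is just how directly those
knobs turn into an always-on I/O ceiling:

BLOCK_SIZE = 8192            # bytes per heap page
vacuum_cost_limit = 200      # default; autovacuum inherits it
cost_delay_ms = 20           # autovacuum_vacuum_cost_delay default
vacuum_cost_page_miss = 10   # cost charged per page read from disk

# A worker accrues cost up to the limit, then sleeps for the delay,
# so it gets roughly this many cost units per second:
units_per_sec = vacuum_cost_limit * (1000.0 / cost_delay_ms)

# Worst case (every page is a miss), that buys this much reading:
pages_per_sec = units_per_sec / vacuum_cost_page_miss
mb_per_sec = pages_per_sec * BLOCK_SIZE / (1024.0 * 1024.0)
print("ceiling: %.0f pages/s, %.1f MB/s" % (pages_per_sec, mb_per_sec))
# -> ceiling: 1000 pages/s, 7.8 MB/s

Raise the limit or drop the delay far enough to survive your worst
spike, and that ceiling applies around the clock, needed or not.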

To make that more concrete, suppose that a table has an update rate
such that it hits the autovac threshold every 10 minutes. If you
configure autovacuum such that a vacuum of that table takes 9
minutes to complete, you are hosed: there will eventually be some
10-minute period where the update rate is ten times the typical
amount, and the table will gradually become horribly bloated. But if
you configure autovacuum such that a vacuum of the table can
finish in 1 minute, so that you can cope with a spike, then whenever
there isn't a spike you are processing the table ten times faster than
necessary, and now one minute out of every ten carries a heavier I/O
load than the other nine, leading to uneven response times.
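
If it helps, here is a toy simulation of exactly that scenario.  All
of the numbers are invented to match the example (1000 updates per
minute, a 10000-dead-tuple threshold, one 10-minute spike at ten
times the normal rate); it is a sketch, not a claim about the real
scheduler:

def simulate(vacuum_minutes, minutes=120, rate=1000, threshold=10000,
             spike_at=60, spike_len=10, spike_mult=10):
    """Peak dead-tuple backlog over the run, in tuples."""
    dead = vac_left = peak = 0
    for t in range(minutes):
        if spike_at <= t < spike_at + spike_len:
            dead += rate * spike_mult   # the once-in-a-while spike
        else:
            dead += rate                # steady background update rate
        if vac_left == 0 and dead >= threshold:
            vac_left = vacuum_minutes   # a vacuum pass kicks off
        if vac_left > 0:
            vac_left -= 1
            if vac_left == 0:
                dead = 0                # pass done; backlog cleared
        peak = max(peak, dead)
    return peak

# The 9-minute setting lets the spike balloon the backlog roughly an
# order of magnitude higher than the 1-minute setting does.
print("9-minute vacuums, peak backlog:", simulate(9))
print("1-minute vacuums, peak backlog:", simulate(1))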

It's just ridiculous to assert that it doesn't matter if all the
anti-wraparound vacuums start simultaneously. It does matter. For
one thing, once every single autovacuum worker is pinned down doing an
anti-wraparound vacuum of some table, then a table that needs an
ordinary vacuum may have to wait quite some time before a worker is
available. Depending on the order in which workers iterate through
the tables, you could end up finishing all of the anti-wraparound
vacuums before doing any of the regular vacuums. If the wraparound
vacuums had been properly spread out, then there would at all times
have been workers available for regular vacuums as needed. For
another thing, you can't possibly think that three or five workers
running simultaneously, each reading a different table, is just as
efficient as having one worker grind through them consecutively.
Parallelism is not free, ever, and particularly not here, where it has
the potential to yank the disk head around between five different
files, seeking like crazy, instead of a nice sequential I/O pattern on
each file in turn. Josh wouldn't keep complaining about this if it
didn't suck.
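
For anyone who doubts it, here is a crude single-spindle model.  The
bandwidth and seek numbers are invented for illustration (call it
100MB/s sequential and 8ms per seek), but the shape of the result is
the point:

SEQ_MBPS = 100.0   # sequential read bandwidth, MB/s (assumed)
SEEK_MS = 8.0      # average seek time, ms (assumed)
CHUNK_MB = 0.125   # 128KB read before the head hops to another file

def aggregate_mbps(streams):
    """Total MB/s when `streams` scans share one disk round-robin."""
    # With more than one stream, every chunk read costs a seek to
    # hop to the next file; a lone scan just streams.
    seek_s = (SEEK_MS / 1000.0) if streams > 1 else 0.0
    per_chunk_s = seek_s + CHUNK_MB / SEQ_MBPS
    return CHUNK_MB / per_chunk_s

print("1 worker, sequential:  %.1f MB/s" % aggregate_mbps(1))
print("5 workers interleaved: %.1f MB/s" % aggregate_mbps(5))
# -> about 100 MB/s versus about 13.5 MB/s

Under those assumptions, five interleaved scans together move well
under a fifth of what one worker scanning sequentially would.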

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
