Cost-Based Vacuum Delay tuning

From: Guillaume Cottenceau <gc(at)mnc(dot)ch>
To: pgsql-performance(at)postgresql(dot)org
Subject: Cost-Based Vacuum Delay tuning
Date: 2007-12-07 10:50:19
Message-ID: 87y7c6aidw.fsf@messaging.mobileway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

I'm currently trying to tune the Cost-Based Vacuum Delay in a
8.2.5 server. The aim is to reduce as much as possible the
performance impact of vacuums on application queries, with the
background idea of running autovacuum as much as possible[1].

My test involves vacuuming a large table, and measuring the
completion time, as the vacuuming proceeds, of a rather long
running application query (involving a table different from the
one being vacuumed) which cannot fit entirely in buffers (and the
completion time of the vacuum, because it needs not be too slow,
of course).

I ran my tests with a few combinations of
vacuum_cost_delay/vacuum_cost_limit, while keeping the other
parameters set to the default from the 8.2.5 tarball:

vacuum_cost_page_hit = 1
vacuum_cost_page_miss = 10
vacuum_cost_page_dirty = 20

The completion time of the query is about 16 seconds in
isolation. With a vacuuming proceeding, here are the results:

vacuum_cost_delay/vacuum_cost_limit (deactivated) 20/200 40/200 100/1000 150/1000 200/1000 300/1000

VACUUM ANALYZE time 54 s 112 s 188 s 109 s 152 s 190 s 274 s
SELECT time 50 s 28 s 26 s 24 s 22 s 20 s 19 s

I have noticed that others (Alvaro, Joshua) suggest to set
vacuum_cost_delay as low as 10 or 20 ms, however in my situation
I'd like to lower the performance impact in application queries
and will probably choose 150/1000 where "only" a +40% is seen -
I'm curious if anyone else has followed the same path, or is
there any outstanding flaw I've missed here? I'm talking
outstanding, as of course any local decision may be different in
the hope of favouring a different database/application behaviour.

Other than that, it's the results obtained with the design
principle of Cost-Base Vacuum Delay, which I find a little
surprising. Of course, I think it has been thought through a lot,
and my observations are probably naive, but I'm going to throw my
ideas anyway, who knows.

I'd think that it would be possible to lower yet again the impact
of vacuuming on other queries, while keeping a vacuuming time
with little overhead, if dynamically changing the delays related
to database activity, rather than using fixed costs and delays.
For example, before and after each vacuum sleep delay is
completed, pg could:

- check the amount of currently running queries
(pg_stat_activity), and continue sleeping if it is above a
configured threshold; by following this path, databases with
peak activities could use a threshold of 1 and have zero
ressource comsumption for vacuuming during peaks, still having
nearly no time completion overhead for vacuuming out of peaks
(since the check is performed also before the sleep delay,
which would be deactivated if no queries are running); if we
can afford a luxury implementation, we could always have a
maximum sleep time configuration, which would allow vacuuming
to proceed a little bit even when there's no timeframe with low
enough database activity

- alternatively, pg could make use of some longer term statistics
(load average, IO statistics) to dynamically pause the
vacuuming - this I guess is related to the host OS and probably
more difficult to have working correctly with multiple disks
and/or processes running - however, if you want high
performance from PostgreSQL, you probably won't host other IO
applications on the same disk(s)

While I'm at it, a different Cost-Based Vacuum Delay issue:
VACUUM FULL also follows the Cost-Based Vacuum Delay tunings.
While it makes total sense when you want to perform a query on
another table, it becomes a problem when your query is waiting
for the exclusive lock on the vacuumed table. Potentially, you
will have the vacuuming proceeding "slowly" because of the
Cost-Based Vacuum Delay, and a blocked application because the
application queries are just waiting.

I'm wondering if it would not be possible to dynamically ignore
(or lower, if it makes more sense?) the Cost-Based Vacuum Delay
during vacuum full, if a configurable amount of queries are
waiting for the lock?

(please save yourself from answering "you should never run VACUUM
FULL if you're vacuuming enough" - as long as VACUUM FULL is
available in PostgreSQL, there's no reason to not make it as
practically usable as possible, albeit with low dev priority)

Ref:
[1] inspired by http://developer.postgresql.org/~wieck/vacuum_cost/

--
Guillaume Cottenceau, MNC Mobile News Channel SA, an Alcatel-Lucent Company
Av. de la Gare 10, 1003 Lausanne, Switzerland

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Piotr Gasidło 2007-12-07 10:55:08 Trouble with LEFT JOIN using VIEWS.
Previous Message Simon Riggs 2007-12-07 08:39:20 Re: database tuning