Re: Displaying accumulated autovacuum cost

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Displaying accumulated autovacuum cost
Date: 2011-11-25 16:39:31
Message-ID: 1322237900-sup-5096@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Excerpts from Greg Smith's message of mié nov 16 04:37:43 -0300 2011:
> On 10/05/2011 03:02 AM, Greg Smith wrote:
> > Presumably you meant to ask if this makes sense to show when cost
> > accounting isn't enabled, because the code doesn't do that right now.
> > No cost accounting, no buffer usage/write rate data as this was
> > submitted.
>
> This is done in the attached update. I just made the page accounting
> happen all the time, regardless of whether the costs were being
> accumulated. Added a read rate too, which is how fast reads happened
> from the OS cache to shared_buffers. Simple test case generates a 600MB
> pgbench_accounts database and wipes out enough to take a while to clean
> up; it needs log_autovacuum_min_duration = 0 and then:
>
> $ createdb pgbench
> $ pgbench -i -s 10 pgbench
> $ psql -d pgbench -c "delete from pgbench_accounts where aid<200000"
>
> LOG: automatic vacuum of table "pgbench.public.pgbench_accounts": index
> scans: 1
> pages: 0 removed, 16394 remain
> tuples: 199999 removed, 640011 remain
> buffer usage: 13742 hits, 2708 misses, 1058 dirtied
> avg read rate: 3.067 MiB/s, avg write rate: 1.198 MiB/s
> system usage: CPU 0.05s/0.61u sec elapsed 6.89 sec

I was about to commit this when I noticed that the avg values may not be
all that representative of reality after all; consider that it's
computed across the whole duration of the vacuuming operation, including
the index scans ... it'd be possibly useful to keep separate timings for
the heap scan (which is likely to use I/O more quickly) from index
scans. That way you can tune for the max, not a possibly misleading
average. That's a much larger change though, so I'm not going to get
into it.

Does anybody else think this would be worthwhile? If so we can stick it
into the TODO with an "easy" tag for someone to tackle -- seems like a
useful first project.

One funny thing in the test I did was that the buffer count might add to
a much larger amount than total disk pages:

LOG: automatic vacuum of table "alvherre.public.foo": index scans: 4
pages: 0 removed, 8850 remain
tuples: 666668 removed, 1333332 remain
buffer usage: 14675 hits, 33857 misses, 20274 dirtied
avg read rate: 2.823 MiB/s, avg write rate: 1.690 MiB/s
system usage: CPU 1.26s/8.08u sec elapsed 93.69 sec

The table and index:

alvherre=# select relname, relpages from pg_class where relname like 'foo%';
relname | relpages
---------+----------
foo | 8850
foo_idx | 5487
(2 filas)

My guess is that this is roughly counting three heap scans plus the four
index scans mentioned in the log report (there were so many because I
reduced maintenance_work_mem to its 1 MB minimum):

alvherre=# select 5487 * 4 + 8850 * 3;
?column?
----------
48498
(1 fila)

alvherre=# select 14675 + 33857;
?column?
----------
48532
(1 fila)

My test case was

create table foo (a int);
insert into foo select * from generate_series (1, 2000000);
create index foo_idx on foo (a);
delete from foo where a % 6 in (1,2);

I then checkpointed before autovac had the chance to process the table,
just to see a higher number of pages dirtied. Note how the number of
pages dirtied is also much higher the total number of existing pages!

I'm going to push this now anyway, thanks.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2011-11-25 16:45:34 Re: [COMMITTERS] pgsql: Enable CHECK constraints to be declared NOT VALID
Previous Message Dean Rasheed 2011-11-25 16:16:29 Re: [COMMITTERS] pgsql: Enable CHECK constraints to be declared NOT VALID