Re: autovacuum not prioritising for-wraparound tables

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: autovacuum not prioritising for-wraparound tables
Date: 2013-01-30 09:48:40
Message-ID: 20130130094840.GC5516@awork2.anarazel.de
Lists: pgsql-hackers

On 2013-01-29 16:09:52 +1100, Josh Berkus wrote:
>
> > I have to admit, I fail to see why this is a good idea. There isn't much
> > of an efficiency bonus in freezing early (due to hint bits) and vacuums
> > over vacuum_freeze_table_age are considerably more expensive as they
> > have to scan the whole heap instead of using the visibilitymap. And if
> > you don't vacuum the whole heap you can't lower relfrozenxid. So
> > changing freeze_min_age doesn't help at all to avoid anti-wraparound
> > vacuums.
> >
> > Am I missing something?
>
> Yep. First, you're confusing vacuum_freeze_table_age and
> vacuum_freeze_min_age.

Don't think I did. I was talking about vacuum_freeze_table_age because
that influences the number of full-table scans, in contrast to ones using
the vm. That's independent of anti-wraparound vacuums, which
are triggered by autovacuum_freeze_max_age.

The point I was trying to make is that a very big part of the load is
not actually the freezing itself but the full-table vacuums which are
triggered by freeze_table_age.

> Second, you're not doing any arithmetic.

Because it's not actually as easy to calculate as you make it seem.

Even in the case of a large insert-only table you have much more
complex behaviour than what you describe.
The lifetime of tuples/buffers in that context is approximately the following:
- inserted
- written by bgwriter or by checkpoint
- vacuum reads the non-all-visible part of the table
- vacuum sets HEAP_XMIN_COMMITTED
- freeze_table_age vacuum reads the whole table
- doesn't find anything because of freeze_min_age
- freeze_table_age vacuum reads the whole table
- freezes tuple because >= freeze_min_age
- freeze_table_age vacuum reads the whole table
- doesn't change anything in our page because it's already frozen
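
The lifecycle above can be walked through with a toy model. This is only a sketch under stated assumptions: XIDs are plain integers with no wraparound, there is a single tuple on a single page, and the names `vacuum` and `run_scenario` as well as the chosen vacuum times are hypothetical. The two constants are the defaults mentioned in this thread (50 million and 150 million).

```python
# Toy model of one tuple in an insert-only table. Not PostgreSQL code.
FREEZE_MIN_AGE = 50_000_000     # vacuum_freeze_min_age default
FREEZE_TABLE_AGE = 150_000_000  # vacuum_freeze_table_age default


def vacuum(xid, table, tup):
    """Run one vacuum at transaction id `xid` and describe what it did."""
    # A vacuum scans the whole heap once relfrozenxid is old enough.
    full_scan = xid - table["relfrozenxid"] >= FREEZE_TABLE_AGE
    if not full_scan and tup["all_visible"]:
        # The visibility map lets a normal vacuum skip this page.
        return "vm scan: page skipped"
    kind = "full scan" if full_scan else "vm scan"
    effects = []
    if not tup["hinted"]:
        tup["hinted"] = True
        effects.append("sets HEAP_XMIN_COMMITTED")
    if tup["frozen"]:
        effects.append("already frozen")
    elif xid - tup["xmin"] >= FREEZE_MIN_AGE:
        tup["frozen"] = True
        effects.append("freezes tuple")
    else:
        effects.append("too young to freeze")
    tup["all_visible"] = True
    if full_scan:
        # Simplification: assume relfrozenxid advances to the freeze cutoff.
        table["relfrozenxid"] = xid - FREEZE_MIN_AGE
    return f"{kind}: {', '.join(effects)}"


def run_scenario():
    """Tuple inserted at XID 120m, then four vacuums at made-up times."""
    table = {"relfrozenxid": 0}
    tup = {"xmin": 120_000_000, "hinted": False,
           "frozen": False, "all_visible": False}
    times = [130_000_000, 160_000_000, 270_000_000, 380_000_000]
    return [vacuum(xid, table, tup) for xid in times]


if __name__ == "__main__":
    for line in run_scenario():
        print(line)
```

In this run the page is read four times but the tuple is modified only twice (hint bit, then freeze); the other two full-table passes are pure read cost.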

So the big point your computation is missing is that all those
anti-wraparound vacuums a) might not even happen, because normal vacuums
on a table that is over freeze_table_age already advance relfrozenxid,
b) don't rewrite the whole table because the tuples are already frozen,
c) will write out a page repeatedly because of tuples that get changed
again, and d) incur full page writes.

> Let's do this by example. TableA is a large table which receives an
> almost constant stream of individual row updates, inserts, and deletes.
>
> DEFAULTS:
>
> XID 1: First rows in TableA are updated.
> XID 200m: Anti-wraparound autovac of TableA.
> All XIDs older than XID 100m set to FROZENXID.

Between those the table will have been vacuumed already and, depending on
the schedule, the tuples will already have been frozen due to
freeze_min_age being 50 million and freeze_table_age being 150 million.
Before that, all the tuples will already have been written another time
for hint bit writes.

> XID 300m: Anti-wraparound autovac of TableA
> All XIDs older than XID 200M set to FROZENXID.

Only the newer tuples are going to be rewritten, the older parts of the
table will only be read.

> XID 400m: Anti-wraparound autovac of TableA
> All XIDs older than XID 300M set to FROZENXID.
> XID 500m: Anti-wraparound autovac of TableA
> All XIDs older than XID 400M set to FROZENXID.
> XID 600m: Anti-wraparound autovac of TableA
> All XIDs older than XID 500M set to FROZENXID.

> vacuum_freeze_min_age = 1m
>
> XID 1: First rows in TableA are updated.
> XID 200m: Anti-wraparound autovac of TableA.
> All XIDs older than XID 199m set to FROZENXID.

Even in an insert-only case the tuples will be written at least twice
before an anti-wraparound vacuum, often thrice:
- first checkpoint
- hint bit sets due to a normal vacuum
- frozen due to a full-table vacuum

But, as you assumed, the table will also get deletes and updates, so the
low freeze age will mean that some tuples on a page get frozen on each
vacuum that reads the page. That incurs a full-page write every time
some tuples are frozen, as most of the time the page was last touched
before the last checkpoint happened. As the WAL is a major bottleneck on
a write-heavy server, that can incur a pretty hefty global slowdown.
It's *good* to only freeze tuples once you're pretty damn sure they won't
be touched by actual data changes again. As full-table vacuums happen more
frequently than anti-wraparound vacuums anyway, the cost of actual
anti-wraparound vacuums, should they happen because of a too busy
autovacuum, isn't a problem in itself.
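
To put a rough number on the full-page-write point, here is a minimal sketch. It assumes full_page_writes is on, the default 8192-byte page size, and that a full page image is logged when the page's LSN is not newer than the last checkpoint's redo pointer. The 64-byte size for the freeze record and the function names are made up for illustration, and compression of the unused space in a page is ignored.

```python
BLCKSZ = 8192  # default PostgreSQL page size in bytes


def freeze_wal_bytes(page_lsn, redo_lsn, record_bytes=64):
    """Approximate WAL bytes for freezing some tuples on one page.

    record_bytes is a hypothetical size for the freeze record itself.
    """
    # First change to a page since the last checkpoint: log the whole page.
    needs_full_page_image = page_lsn <= redo_lsn
    return record_bytes + (BLCKSZ if needs_full_page_image else 0)


def total_freeze_wal(page_lsns, redo_lsn):
    """Sum the approximate WAL volume over all pages a vacuum freezes on."""
    return sum(freeze_wal_bytes(lsn, redo_lsn) for lsn in page_lsns)
```

With these assumptions, a page last touched before the checkpoint costs about 8256 bytes of WAL for the freeze, against 64 bytes for a page already modified since the checkpoint, which is why freezing a few tuples per page on every vacuum adds up.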

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
