Re: Freeze avoidance of very large table.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: Freeze avoidance of very large table.
Date: 2015-04-22 16:11:34
Message-ID: CA+Tgmoa8-rMNz2YtMo02wRZ06or1guUrwbeUy8Lbrkvz2L7vQg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 22, 2015 at 11:09 AM, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> It's possible that we could use this infrastructure to freeze
>> more aggressively in other circumstances. For example, perhaps
>> VACUUM should freeze any page it intends to mark all-visible.
>> That's not a guaranteed win, because it might increase WAL
>> volume: setting a page all-visible does not emit an FPI for that
>> page, but freezing any tuple on it would, if the page hasn't
>> otherwise been modified since the last checkpoint. Even if that
>> were no issue, the freezing itself must be WAL-logged. But if we
>> could somehow get to a place where all-visible => frozen, then
>> autovacuum would never need to visit all-visible pages, a huge
>> win.
>
> That would eliminate full-table scan vacuums, right? It would do
> that by adding incremental effort and WAL to the "normal"
> autovacuum run to eliminate the full table scan and the associated
> mass freeze WAL-logging? It's hard to see how that would not be an
> overall win.

Yes and yes.

In terms of an overall win, this design loses when the tuples that
have been recently marked all-visible are going to get updated again
in the near future. In that case, the effort we spend to freeze them
is wasted. I just tested "pgbench -i -s 40 -n" followed by "VACUUM"
or alternatively followed by "VACUUM FREEZE". The VACUUM generated
4641kB of WAL. The VACUUM FREEZE generated 515MB of WAL - that is,
113 times more. So changing every VACUUM to act like VACUUM FREEZE
would be quite expensive. We'll still come out ahead if those tuples
are going to stick around long enough that they would have eventually
gotten frozen anyway, but if they get deleted again the loss is pretty
significant.

Incidentally, the reason for the large difference is that when Heikki
created the visibility map, it wasn't necessary for the WAL records
that set the visibility map bits to bump the page LSN, because it was
just a hint anyway. When I made the visibility-map crash-safe, I went
to some pains to preserve that property. Therefore, a regular VACUUM
does not emit full page images for the heap pages - it does for the
visibility map pages themselves, but there aren't very many of those.
In this example, the relation itself was 512MB, so you can see that
adding freezing to the mix roughly doubles the I/O cost. Either way
we have to write half a gig of dirty data pages, but in one case we
also have to write an additional half a gig of WAL.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2015-04-22 16:16:43 Re: Turning off HOT/Cleanup sometimes
Previous Message Stephen Frost 2015-04-22 16:02:54 Re: Row security violation error is misleading