Re: 9.3: summary of corruption detection / checksums / CRCs discussion

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: 9.3: summary of corruption detection / checksums / CRCs discussion
Date: 2012-04-25 16:06:47
Message-ID: CA+TgmoakwoQ2yoE1DfkqF8ykp28sWvD7zBvT-_QwWS4voJv3qQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Apr 24, 2012 at 8:52 PM, Greg Stark <stark(at)mit(dot)edu> wrote:
> On Tue, Apr 24, 2012 at 9:40 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>  For one thing, index pages
>> have hint-type changes that are not single-bit changes.
>
> ? Just how big are these? Part of the reason hint bit updates are safe
> is because one bit definitely absolutely has to be entirely in one
> page. You can't tear a page in the middle of a bit. In reality the
> unit that gets written atomically is much larger, probably 4k and
> almost certainly at least 512 bytes. But the postgres block layout
> doesn't really offer many guarantees about the location of anything
> relative to those 512-byte blocks, so probably anything larger than a
> word is unsafe to update.

See _bt_killitems. It uses ItemIdMarkDead, which looks like it will
turn into a 4-byte store.
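
To make that concrete, here is a rough sketch of the line pointer
layout, simplified from what I recall of src/include/storage/itemid.h.
The helper mark_dead_copy is hypothetical and exists only for
illustration. The point is that lp_flags is a 2-bit field packed into
a 32-bit word, so the compiler is free to implement the assignment as
a load, mask, and store of the whole word rather than a single-bit
change. The size check assumes a typical compiler's bit-field packing.

```c
#include <assert.h>

/* Simplified line pointer: three bit-fields packed into one 32-bit word. */
typedef struct ItemIdData
{
    unsigned lp_off:15,   /* offset to tuple (from start of page) */
             lp_flags:2,  /* state of line pointer */
             lp_len:15;   /* byte length of tuple */
} ItemIdData;

#define LP_NORMAL 1       /* used (lp_len is always > 0) */
#define LP_DEAD   3       /* dead, may or may not have storage */

/* Mark the line pointer dead, keeping its existing storage. */
#define ItemIdMarkDead(itemId) ((itemId)->lp_flags = LP_DEAD)

/*
 * Hypothetical helper, not part of PostgreSQL: return a copy of the
 * item with the dead flag set, so the effect can be inspected.
 */
static ItemIdData
mark_dead_copy(ItemIdData item)
{
    /* Assumption: unsigned is 32 bits and the fields pack into one word. */
    assert(sizeof(ItemIdData) == 4);
    ItemIdMarkDead(&item);
    return item;
}
```

Only the flag bits change value, but nothing in the language says the
store that gets emitted touches only those bits.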

> The main problem with the approach was that we kept finding more hint
> bits we had forgotten about. Once the coding idiom was established it
> seems it was a handy hammer for a lot of problems.

It is. And I think we shouldn't be lulled into the trap of thinking
hint bits are bad. They do cause some problems, but they exist
because they solve even worse problems. It's fundamentally pretty
useful to be able to cache the results of expensive calculations in
data pages, which is what hints allow us to do, and they let us do it
without incurring the overhead of WAL-logging. Even if we could find
a way of making CLOG access cheap enough that we didn't need
HEAP_XMIN/XMAX_COMMITTED, it wouldn't clear the way to getting rid of
hinting entirely. I strongly suspect that the btree item-is-dead
hinting is actually MORE valuable than the heap hint bits. CLOG
probes are expensive, but there is room for optimization there, both
through caching and because the data set is relatively limited in size.
OTOH, the btree hints potentially save you a heap fetch on the next
trip through, which potentially means a random I/O into a huge table.
That's nothing to sneeze at. It also means that the next index
insertion in the page can potentially prune that item away completely,
allowing faster space re-use. That's nothing to sneeze at, either.
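
For anyone who hasn't looked at how the heap hints work, the caching
idea is roughly the following. This is a toy sketch, not the real
visibility code: ToyTupleHeader, toy_clog_did_commit, and
toy_xmin_committed are all made-up names, and the "CLOG" here is a
fake rule (even XIDs committed) with a counter standing in for the
cost of a probe. Only the HEAP_XMIN_COMMITTED bit value is taken from
htup.h.

```c
#include <stdbool.h>
#include <stdint.h>

#define HEAP_XMIN_COMMITTED 0x0100   /* hint: inserting xact committed */

/* Toy tuple header: just the inserting XID and the flag word. */
typedef struct ToyTupleHeader
{
    uint32_t xmin;
    uint16_t infomask;
} ToyTupleHeader;

/* Counts how many times we had to do the "expensive" lookup. */
static int clog_probes = 0;

/* Hypothetical stand-in for a CLOG probe: even XIDs committed. */
static bool
toy_clog_did_commit(uint32_t xid)
{
    clog_probes++;
    return (xid % 2) == 0;
}

/*
 * Did the inserting transaction commit?  Check the hint first; on a
 * miss, do the expensive lookup and cache a positive answer in the
 * tuple itself.  Note that setting the hint emits no WAL record.
 */
static bool
toy_xmin_committed(ToyTupleHeader *tup)
{
    if (tup->infomask & HEAP_XMIN_COMMITTED)
        return true;                          /* cheap path */

    if (toy_clog_did_commit(tup->xmin))
    {
        tup->infomask |= HEAP_XMIN_COMMITTED; /* cache the result */
        return true;
    }
    return false;
}
```

The second call on the same tuple never reaches the lookup. The btree
item-is-dead hint follows the same pattern, except that what it lets
you skip is a heap fetch rather than a CLOG probe.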

To put that another way, the reason WAL-logging all hints seems
expensive is that NOT WAL-logging hints is a huge performance
optimization. If we can come up with an even better performance
optimization that also reduces the need to write out hinted pages,
then of course we should do that, but we shouldn't hate the
optimization we have because it's not as good as the one we wish we
had.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
