Re: Block-level CRC checks

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Aidan Van Dyk <aidan(at)highrise(dot)ca>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Block-level CRC checks
Date: 2009-12-02 00:03:26
Message-ID: 25437.1259712206@sss.pgh.pa.us
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> writes:
> On Tue, Dec 1, 2009 at 10:47 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I don't think "relatively cheap" is the right criterion here --- the
>> question to me is how many assumptions are you making in order to
>> compute the page's CRC. Each assumption degrades the reliability
>> of the check, not to mention creating another maintenance hazard.

> Well the only assumption here is that we know where the line pointers
> start and end.

... and what they contain. To CRC a subset of the page at all, we have
to put some amount of faith into the page header's pointers. We can do
weak checks on those, but only weak ones. If we process different parts
of the page differently, we're increasing our trust in those pointers
and reducing the quality of the CRC check.
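
As a rough illustration of that point (not actual PostgreSQL code), here is
a minimal C sketch. Every name in it is made up for the example: the
two-field DemoPageHeader, header_looks_sane, line_pointer_crc, and the
8192-byte page size are assumptions, and the CRC-32 routine is an ordinary
bitwise one chosen only for brevity. It shows a "lower" pointer that is
wrong but still passes the range checks, so a flipped byte falls outside
the partial CRC while a CRC over the whole page still notices it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 8192

/* Hypothetical, simplified header; NOT PostgreSQL's real page layout. */
typedef struct {
    uint16_t lower;   /* offset just past the line pointer array */
    uint16_t upper;   /* offset where tuple data begins */
} DemoPageHeader;

/* Plain bitwise CRC-32 (reflected polynomial 0xEDB88320). */
uint32_t demo_crc32(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    return crc ^ 0xFFFFFFFFu;
}

/* The weak check: ordering and range only, nothing about the contents. */
int header_looks_sane(const DemoPageHeader *h)
{
    return h->lower >= sizeof(DemoPageHeader) &&
           h->lower <= h->upper &&
           h->upper <= DEMO_PAGE_SIZE;
}

/*
 * CRC only the line pointer area [sizeof(header), lower).
 * Returns 0 if the header fails the weak check, else 1 with *crc set.
 */
int line_pointer_crc(const unsigned char *page, uint32_t *crc)
{
    DemoPageHeader h;

    memcpy(&h, page, sizeof h);
    if (!header_looks_sane(&h))
        return 0;
    *crc = demo_crc32(page + sizeof h, h.lower - sizeof h);
    return 1;
}

/*
 * Returns 1 if a plausible-but-wrong "lower" lets a flipped byte escape
 * the partial CRC while a whole-page CRC still notices the change.
 */
int damaged_pointer_hides_corruption(void)
{
    static unsigned char page[DEMO_PAGE_SIZE];
    DemoPageHeader h = { 60, 8000 };  /* pretend the true lower was 100 */
    uint32_t part_before, part_after, whole_before, whole_after;

    memset(page, 0xAB, sizeof page);
    memcpy(page, &h, sizeof h);

    if (!line_pointer_crc(page, &part_before))
        return 0;
    whole_before = demo_crc32(page, sizeof page);

    page[80] ^= 0x01;                 /* damage inside the real pointer area */

    if (!line_pointer_crc(page, &part_after))
        return 0;
    whole_after = demo_crc32(page, sizeof page);

    return part_before == part_after && whole_before != whole_after;
}
```

The range test in header_looks_sane is about as much as can be verified
without already trusting the page, which is why each extra region boundary
the CRC code depends on weakens the check.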

> It seems to me adding a third structure on the page and then requiring
> tqual to be able to find that doesn't significantly reduce the
> complexity over having tqual be able to find the line pointers. And it
> significantly increases the complexity of every other part of the
> system which has to deal with a third structure on the page. And
> adding and compacting the page becomes a lot more complex.

The page compaction logic amounts to a grand total of two not-very-long
routines. The vast majority of the code impact from this would be from
the problem of finding the out-of-line hint bits for a tuple, which as
you say appears about equally hard either way. So I think keeping
the CRC logic as simple as possible is good from both a reliability and
a performance standpoint.

regards, tom lane
