Re: WIP checksums patch

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Jeff Davis'" <pgsql(at)j-davis(dot)com>, "'Simon Riggs'" <simon(at)2ndQuadrant(dot)com>
Cc: "'Bruce Momjian'" <bruce(at)momjian(dot)us>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP checksums patch
Date: 2012-10-09 06:51:45
Message-ID: 008001cda5ea$8cdd8330$a6988990$@kapila@huawei.com
Lists: pgsql-hackers

On Monday, October 01, 2012 11:11 PM Jeff Davis wrote:
> On Mon, 2012-10-01 at 18:14 +0100, Simon Riggs wrote:
> > You are missing large parts of the previous thread, giving various
> > opinions on what the UI should look like for enabling checksums.
>
> I read through all of the discussion that I could find. There was quite
> a lot, so perhaps I have forgotten pieces of it.
>
> But that one section in the docs does look out of date and/or confusing
> to me.
>
> I remember there was discussion about a way to ensure that checksums are
> set cluster-wide with some kind of special command (perhaps related to
> VACUUM) and a magic file to let recovery know whether checksums are set
> everywhere or not. That doesn't necessarily conflict with the GUC though
> (the GUC could be a way to write checksums lazily, while this new
> command could be a way to write checksums eagerly).
>
> If some consensus was reached on the exact user interface, can you
> please send me a link?

AFAICT complete consensus has not been reached, but one of the discussions can be found at the link below:
http://archives.postgresql.org/pgsql-hackers/2012-03/msg00279.php
There Robert gave suggestions, and further discussion followed based on those points.

In my view, based on the previous discussions, the main points where this patch needs more work are as follows:

1. The performance impact of WAL-logging hint bits needs to be verified for scenarios other than pgbench, such as bulk data load
(which I feel there is some way to optimize, but I don't know if that's part of this patch).
2. How to put the information in the page header.
I think the general direction is to use pd_tli.
It is not decided whether each page should record that it has a checksum, or whether that should be maintained at the table or database level.
3. User interface for checksum options.
4. There is still no consensus about locking the buffer.
5. Anything else that I have missed?

Apart from the above, one concern raised by many members is that there should be page format upgrade infrastructure first,
and only then should we think about adding checksums (http://archives.postgresql.org/pgsql-hackers/2012-02/msg01517.php).
The major point for the upgrade is that it should be an online upgrade, and
the problem for which I think there is no clear solution yet is how to ensure that a database will never have more than 2 page formats.

If the general consensus is that we should do it without the upgrade infrastructure, then I think we can pursue discussion of the main points listed above.

With Regards,
Amit Kapila.
