From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Magnus Hagander <magnus(at)hagander(dot)net>, Michael Banck <michael(dot)banck(at)credativ(dot)de> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, David Steele <david(at)pgmasters(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: pgsql: Validate page level checksums in base backups |
Date: | 2018-04-10 21:35:21 |
Message-ID: | 500ab687-d588-020a-0e23-bfec853aac9c@2ndquadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-committers pgsql-hackers |
On 04/10/2018 11:24 PM, Tomas Vondra wrote:
> Hi,
>
> I think there's a bug in sendFile(). We do check checksums on all pages
> that pass this LSN check:
>
> /*
> * Only check pages which have not been modified since the
> * start of the base backup. Otherwise, they might have been
> * written only halfway and the checksum would not be valid.
> * However, replaying WAL would reinstate the correct page in
> * this case.
> */
> if (PageGetLSN(page) < startptr)
> {
> ...
> }
>
> Now, imagine the page is new, i.e. all-zeroes. That means the LSN is 0/0
> too, and we'll try to verify the checksum - but we actually do not set
> checksums on empty pages.
>
> So I think it should be something like this:
>
> if ((!PageIsNew(page)) && (PageGetLSN(page) < startptr))
> {
> ...
> }
>
> It might be worth verifying that the page is actually all-zeroes (and
> not just with corrupted pd_upper value. Not sure it's worth it.
>
> I've found this by fairly trivial stress testing - running pgbench and
> pg_basebackup in a loop. It was failing pretty reliably (~75% of runs).
> With the proposed change I see no further failures.
>
BTW pg_verify_checksums needs the same fix.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | David Steele | 2018-04-10 21:45:12 | Re: pgsql: Validate page level checksums in base backups |
Previous Message | Tomas Vondra | 2018-04-10 21:24:35 | Re: pgsql: Validate page level checksums in base backups |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2018-04-10 21:40:58 | Re: Gotchas about pg_verify_checksums |
Previous Message | Alvaro Herrera | 2018-04-10 21:32:03 | Re: [HACKERS] Runtime Partition Pruning |