Re: pg_basebackup misses to report checksum error

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Ashwin Agrawal <aagrawal(at)pivotal(dot)io>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup misses to report checksum error
Date: 2020-05-06 22:02:26
Message-ID: CA+TgmobcBb_Oih2p_gYKsMAw7Df11e0TKFyuda-krzp_F9aG3Q@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 6, 2020 at 5:48 PM Ashwin Agrawal <aagrawal(at)pivotal(dot)io> wrote:
> If pg_basebackup is not able to read BLCKSZ content from file, then it
> just emits a warning "could not verify checksum in file "____" block
> X: read buffer size X and page size 8192 differ" currently but misses
> to error with "checksum error occurred". Only if it can read 8192 and
> checksum mismatch happens will it error in the end.

I don't think it's a good idea to conflate "hey, we can't checksum
this because the size is strange" with "hey, the checksum didn't
match". Suppose a file has 1000 full blocks and a partial block.
All 1000 blocks have good checksums. With your change, ISTM that we'd
first emit a warning saying that the checksum couldn't be verified,
and then we'd emit a second warning saying that there was 1 checksum
verification failure, which would also be reported to the stats
system. I don't think that's what we want. There might be an argument
for making this code trigger...

    ereport(ERROR,
            (errcode(ERRCODE_DATA_CORRUPTED),
             errmsg("checksum verification failure during base backup")));

...but I wouldn't for that reason inflate the number of blocks that
are reported as having failures.
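
To make the distinction concrete, here is a minimal standalone sketch of keeping the two conditions in separate counters. It is not the actual basebackup.c code: VerifyTally, tally_read() and backup_should_error() are hypothetical names invented for illustration, and whether a short read should make the backup fail at all is exactly the open question above, so treat the last function as one possible policy rather than a proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BLCKSZ 8192

/* Hypothetical tally; these names do not come from basebackup.c. */
typedef struct VerifyTally
{
    int checksum_failures;   /* full blocks whose checksum did not match */
    int unverifiable_reads;  /* short reads that could not be checksummed */
} VerifyTally;

/*
 * Record the outcome of one read of up to BLCKSZ bytes.  checksum_ok is
 * only meaningful when a full block was read.
 */
static void
tally_read(VerifyTally *tally, size_t bytes_read, bool checksum_ok)
{
    assert(tally != NULL);

    if (bytes_read != BLCKSZ)
        tally->unverifiable_reads++;    /* strange size: not a mismatch */
    else if (!checksum_ok)
        tally->checksum_failures++;     /* genuine checksum mismatch */
}

/*
 * One possible policy (an assumption): end the backup with an error if
 * either kind of problem was seen, without inflating checksum_failures.
 */
static bool
backup_should_error(const VerifyTally *tally)
{
    assert(tally != NULL);

    return tally->checksum_failures > 0 || tally->unverifiable_reads > 0;
}
```

With the file from the example above (1000 good full blocks followed by one partial block), checksum_failures stays at 0 and unverifiable_reads becomes 1, so only the real mismatch count would be reported to the stats system while the backup could still end in an error.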

YMMV, of course.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
