Re: [PATCH] Verify Checksums during Basebackups

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Michael Banck <michael(dot)banck(at)credativ(dot)de>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Verify Checksums during Basebackups
Date: 2018-03-05 11:53:28
Message-ID: 20180305115328.GX2416@tamriel.snowman.net
Lists: pgsql-hackers

Michael,

* Michael Banck (michael(dot)banck(at)credativ(dot)de) wrote:
> On Monday, 05.03.2018 at 06:36 -0500, Stephen Frost wrote:
> > * Michael Banck (michael(dot)banck(at)credativ(dot)de) wrote:
> > > On Sun, Mar 04, 2018 at 06:19:00PM +0100, Magnus Hagander wrote:
> > > > So sure, if we go with WARNING + exit with an errorcode, that is perhaps
> > > > the best combination of the two.
> > >
> > > I had a look at how to go about this, but it appears to be a bit
> > > complicated; the first problem is that sendFile() and sendDir() don't
> > > have status return codes that could be set on checksum verification
> > > failure. So I added a global variable and threw an ereport(ERROR) at the
> > > end of perform_base_backup(), but then I realized that `pg_basebackup'
> > > the client program purges the datadir it created if it gets an error:
> > >
> > > > pg_basebackup: final receive failed: ERROR: Checksum mismatch during
> > > > basebackup
> > > >
> > > > pg_basebackup: removing data directory "data2"
> >
> > Oh, ugh.
>
> I came up with the attached patch, which sets a checksum_failure
> variable in both basebackup.c and pg_basebackup.c, and emits an ereport
> with (for now) ERRCODE_DATA_CORRUPTED at the end of
> perform_base_backup(), which gets caught in pg_basebackup and then used
> to not clean up the datadir, but exit with a non-zero exit code.
>
> Does that seem feasible?

Ah, yes, I had also thought about using a WARNING or NOTICE or similar
to pass back the info about the checksum failure during the backup. That
seems like it would work as long as pg_basebackup captures that
information and puts it into a log or on stdout or similar.

I'm a bit on the fence about whether we should instead just have
pg_basebackup always return a non-zero exit code when a WARNING is seen
during the backup. Given that there's a pretty clear SQL code for this
case, perhaps throwing an ERROR and then checking the SQL code isn't
an issue though.
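
To make the "throw an ERROR and check the SQL code" option concrete, here
is a rough sketch of the client-side decision. It is not the code from the
attached patch: BackupOutcome and classify_backup_result() are hypothetical
names used only for illustration, and the handling of other errors (purge
the datadir, as pg_basebackup does today) is assumed from the behaviour
described above. The one fixed point is that ERRCODE_DATA_CORRUPTED
corresponds to SQLSTATE XX001.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* SQLSTATE that corresponds to ERRCODE_DATA_CORRUPTED. */
#define SQLSTATE_DATA_CORRUPTED "XX001"

/* Hypothetical result type for this sketch; not taken from the patch. */
typedef struct BackupOutcome
{
    int  exit_code;     /* exit status the client should report */
    bool keep_datadir;  /* true: leave the received files in place */
} BackupOutcome;

/*
 * Decide how the client reacts to the final result of the backup.
 * sqlstate is NULL when the server reported no error at all.
 */
void
classify_backup_result(const char *sqlstate, BackupOutcome *out)
{
    assert(out != NULL);

    if (sqlstate == NULL)
    {
        /* Clean backup: success, keep everything. */
        out->exit_code = 0;
        out->keep_datadir = true;
    }
    else if (strcmp(sqlstate, SQLSTATE_DATA_CORRUPTED) == 0)
    {
        /* Checksum failure: report it, but do not purge the datadir. */
        out->exit_code = 1;
        out->keep_datadir = true;
    }
    else
    {
        /* Any other error: assumed to keep today's cleanup behaviour. */
        out->exit_code = 1;
        out->keep_datadir = false;
    }
}
```

In a libpq client the sqlstate argument would presumably come from
PQresultErrorField(res, PG_DIAG_SQLSTATE) on the final result.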

Thanks!

Stephen
