Re: Online verification of checksums

From: Andres Freund <andres(at)anarazel(dot)de>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Michael Banck <michael(dot)banck(at)credativ(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Online verification of checksums
Date: 2019-02-05 07:01:43
Message-ID: 20190205070143.ntsbcldj22mwwf2d@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2019-02-05 06:57:06 +0100, Fabien COELHO wrote:
> > > > I'm wondering (possibly again) about the existing early exit if one block
> > > > cannot be read on retry: the command should count this as a kind of bad
> > > > block, proceed on checking other files, and obviously fail in the end, but
> > > > having checked everything else and generated a report. I do not think that
> > > > this condition warrants a full stop. ISTM that under rare race conditions
> > > > (eg, an unlucky concurrent "drop database" or "drop table") this could
> > > > happen when online, although I could not trigger one despite heavy testing,
> > > > so I'm possibly mistaken.
> > >
> > > This seems like a defensible judgement call either way.
> >
> > Right now we have a few tests that explicitly check that
> > pg_verify_checksums fail on broken data ("foo" in the file). Those
> > would then just get skipped AFAICT, which I think is the worse behaviour
> > , but if everybody thinks that should be the way to go, we can
> > drop/adjust those tests and make pg_verify_checksums skip them.
> >
> > Thoughts?
>
> My point is that it should fail as it does, only not immediately (early
> exit), but after having checked everything else. This mean avoiding calling
> "exit(1)" here and there (lseek, fopen...), but taking note that something
> bad happened, and call exit only in the end.

I can see both as being valuable (one gives you a more complete picture,
the other a quicker answer in scripts). For me that's the point where
it's the prerogative of the author to make that choice.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Haribabu Kommi 2019-02-05 07:19:38 Re: Memory contexts reset for trigger invocations
Previous Message Fabien COELHO 2019-02-05 05:57:06 Re: Online verification of checksums