From: | Michael Banck <michael(dot)banck(at)credativ(dot)de> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Online verification of checksums |
Date: | 2019-03-02 10:45:48 |
Message-ID: | 1551523548.4947.32.camel@credativ.de |
Lists: | pgsql-hackers |
Hi,
On Friday, 2019-03-01 at 18:03 -0500, Robert Haas wrote:
> On Tue, Sep 18, 2018 at 10:37 AM Michael Banck
> <michael(dot)banck(at)credativ(dot)de> wrote:
> > I have added a retry for this as well now, without a pg_sleep() as well.
> > This catches around 80% of the half-reads, but a few slip through. At
> > that point we bail out with exit(1), and the user can try again, which I
> > think is fine?
>
> Maybe I'm confused here, but catching 80% of torn pages doesn't sound
> robust at all.
The chance that pg_verify_checksums hits a torn page (at least in my
tests, see below) is already pretty low: a couple of times per 1000
runs. Maybe 4 out of 5 times, the page is read fine on retry and we march
on. Otherwise, we now just issue a warning and skip the file (or so was
the idea, see below). Do you think that is not acceptable?
I re-ran the tests (concurrent createdb/pgbench -i -s 50/dropdb and
pg_verify_checksums in tight loops) with the current patch version, and
I am seeing short reads very, very rarely (maybe every 1000th run) with
a warning like:
|1174
|pg_verify_checksums: warning: could not read block 374 in file "data/base/18032/18045": read 4096 of 8192
|pg_verify_checksums: warning: could not read block 375 in file "data/base/18032/18045": read 4096 of 8192
|Files skipped: 2
The 1174 is the sequence number of the run; the first 1173 runs of
pg_verify_checksums only skipped blocks.
However, the fact that it shows two warnings for the same file means
there is something wrong here. It was continuing to the next block, while
I think it should just skip to the next file on read failures. So I have
changed that now; new patch attached.
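To illustrate the intended control flow (this is not code from the patch, which is in C), here is a minimal Python sketch of "retry a short read once without sleeping, then warn and skip the rest of the file". The names `scan_blocks`, `scan_file` and `read_block` are made up for this example, and the 8192-byte block size is assumed from the warnings above.

```python
import sys

BLCKSZ = 8192  # block size assumed from the "read 4096 of 8192" warnings


def scan_blocks(read_block, name):
    """Scan one relation file block by block.

    read_block(blockno) is a hypothetical callable returning the bytes
    read for that block (b"" at end of file).
    Returns (blocks_scanned, skipped).
    """
    blockno = 0
    while True:
        buf = read_block(blockno)
        if not buf:
            return blockno, False      # clean end of file
        if len(buf) != BLCKSZ:
            buf = read_block(blockno)  # retry once, without sleeping
        if len(buf) != BLCKSZ:
            print(f'warning: could not read block {blockno} in file '
                  f'"{name}": read {len(buf)} of {BLCKSZ}', file=sys.stderr)
            return blockno, True       # skip the rest of this file
        # Checksum verification of buf would happen here.
        blockno += 1


def scan_file(path):
    """Apply scan_blocks to a real file using seek/read."""
    with open(path, "rb") as f:
        def read_block(blockno):
            f.seek(blockno * BLCKSZ)
            return f.read(BLCKSZ)
        return scan_blocks(read_block, path)
```

Because the function returns on the first read that is still short after the retry, a file produces at most one warning, instead of one warning per remaining block.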
Michael
--
Michael Banck
Project Manager / Senior Consultant
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael(dot)banck(at)credativ(dot)de
credativ GmbH, HRB Mönchengladbach 12080
VAT ID number: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Management: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
Our handling of personal data is subject to the
following provisions: https://www.credativ.de/datenschutz
Attachment | Content-Type | Size |
---|---|---|
online-verification-of-checksums_V12.patch | text/x-patch | 10.0 KB |