Re: Online verification of checksums

From: Andres Freund <andres(at)anarazel(dot)de>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Michael Banck <michael(dot)banck(at)credativ(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Subject: Re: Online verification of checksums
Date: 2019-03-19 20:49:06
Message-ID: 20190319204906.kglh62lt4yvffjzh@alap3.anarazel.de
Lists: pgsql-hackers

On 2019-03-19 13:00:50 -0700, Andres Freund wrote:
> As it stands, the logic seems to give more false confidence than
> anything else.

To demonstrate that, I ran a loop that verified that a) a normal backend
query using the table detects the corruption, and b) pg_basebackup doesn't.

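i=0
while true; do
    i=$((i+1))
    echo "attempt $i"
    # overwrite the first 8192-byte block of the file with random data
    dd if=/dev/urandom of=/srv/dev/pgdev-dev/base/13390/16384 bs=8192 count=1 conv=notrunc 2>/dev/null
    # stop if the backend query succeeds, i.e. the corruption went unnoticed
    psql -X -c 'SELECT * FROM corruptme;' 2>/dev/null && break
    # stop if pg_basebackup fails, i.e. it noticed the corruption
    ~/build/postgres/dev-assert/vpath/src/bin/pg_basebackup/pg_basebackup -X fetch -F t -D - -c fast > /dev/null || break
done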

(excuse the crappy one-off sh)

Over ~12k iterations, the corruption was always detected in the
backend, and never via pg_basebackup. Given the likely LSNs in a
cluster, that's not too surprising.
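
A rough sketch of why the LSNs matter. It assumes, which is not spelled out above, that pg_basebackup only verifies the checksum of a page whose header LSN is below the backup's start LSN and skips newer-looking pages as possibly torn. A page filled from /dev/urandom carries an effectively uniform 64-bit value in that field, so the chance of it being checked at all is roughly start LSN / 2^64. The helper name lsn_check_probability and the start LSN 0/3000000 are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: fraction of uniformly random 64-bit values that fall
# below a given LSN, written in the usual "hi/lo" hex notation.
lsn_check_probability() {
    hi=${1%/*}   # part before the slash (upper 32 bits, hex)
    lo=${1#*/}   # part after the slash (lower 32 bits, hex)
    # The LSN is hi * 2^32 + lo; divide by the number of possible values.
    awk -v hi="$((0x$hi))" -v lo="$((0x$lo))" \
        'BEGIN { printf "%.3e\n", (hi * 2^32 + lo) / 2^64 }'
}

# Assumed start LSN of a small test cluster (illustrative value only).
lsn_check_probability 0/3000000
```

With a start LSN of that magnitude the fraction is on the order of 1e-12, so a run of ~12k attempts would not be expected to hit a single page that actually gets its checksum verified.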

Greetings,

Andres Freund
