|From:||Michael Banck <michael(dot)banck(at)credativ(dot)de>|
|To:||David Steele <david(at)pgmasters(dot)net>, pgsql-hackers(at)lists(dot)postgresql(dot)org|
|Cc:||Stephen Frost <sfrost(at)snowman(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>|
|Subject:||Re: [PATCH] Verify Checksums during Basebackups|
thanks for the review!
Am Donnerstag, den 22.03.2018, 12:22 -0400 schrieb David Steele:
> On 3/17/18 5:34 PM, Michael Banck wrote:
> > On Fri, Mar 09, 2018 at 10:35:33PM +0100, Michael Banck wrote:
> > I think most people (including those I had off-list discussions about
> > this with) were of the opinion that such an option should be there, so I
> > added an additional option NOVERIFY_CHECKSUMS to the BASE_BACKUP
> > replication command and also an option -k / --no-verify-checksums to
> > pg_basebackup to trigger this.
> > Updated patch attached.
> + memcpy(page, (buf + BLCKSZ * i), BLCKSZ);
> Why make a copy here? How about:
> char *page = buf + BLCKSZ * i;
> I know pg_checksum_page manipulates the checksum field but I have found
> it to be safe.
> + if (phdr->pd_checksum != checksum)
> I've attached a patch that adds basic retry functionality. It's not
> terribly efficient since it rereads the entire buffer for any block
> error. A better way is to keep a bitmap for each block in the buffer,
> then on retry compare bitmaps. If the block is still bad, report it.
> If the block was corrected, move on. If a block was good before but is
> bad on retry it can be ignored.
I have to admit I find it a bit convoluted and non-obvious on first
reading, but I'll try to check it out some more.
I think it would be very useful if we could come up with a testcase
showing this problem, but I guess this will be quite hard to hit.
> + ereport(WARNING,
> + (errmsg("checksum verification failed in file "
> I'm worried about how verbose this warning could be since there are
> 131,072 blocks per segment. It's unlikely to have that many block
> errors, but users do sometimes put files in PGDATA which look like they
> should be validated. Since these warnings all go to the server log it
> could get pretty bad.
We only verify checksums of files in tablespaces, and I don't think
dropping random files in those is supported in any way, as opposed to
maybe the top-level PGDATA directory. So I would say that this is not a
problem in practice.
> We should either stop warning after the first failure, or aggregate the
> failures for a file into a single message.
I agree that major corruption could make the whole output blow up, but I
would prefer to keep this feature simple for now, which implies possibly
printing out a lot of WARNINGs, or maybe just stopping after the first
one (or first few, dunno).
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
credativ GmbH, HRB Mönchengladbach 12080
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer