Re: backup manifests

From: David Steele <david(at)pgmasters(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: backup manifests
Date: 2019-11-22 19:29:17
Message-ID: 228fdaa6-616e-68ed-7681-002eabb9666d@pgmasters.net
Lists: pgsql-hackers

On 11/22/19 2:01 PM, Robert Haas wrote:
> On Fri, Nov 22, 2019 at 1:10 PM David Steele <david(at)pgmasters(dot)net> wrote:
>> Well, the maximum amount of data that can be protected with a 32-bit CRC
>> is 512MB according to all the sources I found (NIST, Wikipedia, etc). I
>> presume that's what we are talking about since I can't find any 64-bit
>> CRC code in core or this patch.
>
> Could you give a more precise citation for this?

See:
https://www.nist.gov/system/files/documents/2017/04/26/lrdc_systems_part2_032713.pdf
Search for "The maximum block size"

https://en.wikipedia.org/wiki/Cyclic_redundancy_check
"The design of the CRC polynomial depends on the maximum total length of
the block to be protected (data + CRC bits)", which I took to mean there
are limits.
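
If I'm reading those sources correctly, the 512MB figure lines up with
the 2^32-bit mark, beyond which the polynomial's period-based detection
guarantees no longer hold:

2^32 bits / 8 = 536,870,912 bytes = 512 MiB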

Here's another interesting bit from:
https://en.wikipedia.org/wiki/Mathematics_of_cyclic_redundancy_checks
"Because a CRC is based on division, no polynomial can detect errors
consisting of a string of zeroes prepended to the data, or of missing
leading zeroes" -- but it appears to matter what CRC you are using.
There's a variation that works in this case and hopefully we are using
that one.
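
To make that concrete, here's a throwaway sketch (a toy MSB-first CRC-32
over the IEEE polynomial, not pg_crc32c) showing that the initial register
value is what decides whether prepended zero bytes are detectable: with
init = 0 the two messages produce the same CRC, with init = 0xFFFFFFFF
they don't. My understanding is that our CRC code starts from 0xFFFFFFFF,
which would make it the "good" variant, but it'd be nice to confirm.

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/*
 * Toy MSB-first CRC-32 over the IEEE polynomial 0x04C11DB7, no reflection
 * and no final XOR -- just enough to show the effect of the initial value.
 * Purely illustrative; this is not pg_crc32c.
 */
static uint32_t
toy_crc32(const unsigned char *buf, size_t len, uint32_t init)
{
    uint32_t crc = init;

    for (size_t i = 0; i < len; i++)
    {
        crc ^= ((uint32_t) buf[i]) << 24;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x80000000) ? (crc << 1) ^ 0x04C11DB7 : (crc << 1);
    }
    return crc;
}

int
main(void)
{
    const unsigned char msg[3]    = {'a', 'b', 'c'};
    const unsigned char padded[4] = {0x00, 'a', 'b', 'c'};  /* zero byte prepended */

    /* init = 0: the prepended zero is invisible, the two CRCs collide */
    printf("init 0:  %08X  %08X\n",
           (unsigned) toy_crc32(msg, 3, 0),
           (unsigned) toy_crc32(padded, 4, 0));

    /* init = 0xFFFFFFFF: the prepended zero changes the CRC */
    printf("init ~0: %08X  %08X\n",
           (unsigned) toy_crc32(msg, 3, 0xFFFFFFFF),
           (unsigned) toy_crc32(padded, 4, 0xFFFFFFFF));

    return 0;
}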

This paper talks about appropriate block lengths vs. CRC length:
http://users.ece.cmu.edu/~koopman/roses/dsn04/koopman04_crc_poly_embedded.pdf
but it is concerned with network transmission and small block lengths.

> "Typically an n-bit CRC applied to a data block of arbitrary length
> will detect any single error burst not longer than n bits, and the
> fraction of all longer error bursts that it will detect is (1 −
> 2^−n)."

I'm not sure how encouraging I find this -- a four-byte error burst is
not very long, and 2^32 is only about 4 billion. We have individual users
who have backed up more than 4 billion files over the last few years.
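
As a (deliberately worst-case) back-of-envelope, assuming every one of
those ~4 billion files suffered a long error burst, and each burst slips
past a 32-bit CRC independently with probability about 2^-32:

1 - (1 - 2^-32)^(4 x 10^9) ~= 1 - e^(-4 x 10^9 / 2^32) ~= 0.6

Real corruption rates are obviously far lower than that, but it shows how
quickly 2^-32 stops feeling comfortable at that scale.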

>> This is the basic premise of what we call delta restore which can speed
>> up restores by orders of magnitude.
>>
>> Delta restore is the main advantage that made us decide to require SHA1
>> checksums. In most cases, restore speed is more important than backup
>> speed.
>
> I see your point, but it's not the whole story. We've encountered a
> bunch of cases where the time it took to complete a backup exceeded
> the user's desired backup interval, which is obviously very bad, or
> even more commonly where it exceeded the length of the user's
> "low-usage" period when they could tolerate the extra overhead imposed
> by the backup. A few percentage points is probably not a big deal, but
> a user who has an 8-hour window to get the backup done overnight will
> not be happy if it's taking 6 hours now and we tack 40%-50% on to
> that. So I think that we either have to disable backup checksums by
> default, or figure out a way to get the overhead down to something a
> lot smaller than what current tests are showing -- which we could
> possibly do without changing the algorithm if we can somehow make it a
> lot cheaper, but otherwise I think the choice is between disabling the
> functionality altogether by default and adopting a less-expensive
> algorithm. Maybe someday when delta restore is in core and widely used
> and CPUs are faster, it'll make sense to revise the default, and
> that's cool, but I can't see imposing a big overhead by default to
> enable a feature core doesn't have yet...

OK, I'll buy that. But I *don't* think CRCs should be allowed for
deltas (when we have them) and I *do* think we should caveat their
effectiveness (assuming we can agree on them).

In general the answer to faster backups should be more cores/faster
network/faster disk, not compromising backup integrity. I understand
we'll need to wait until we have parallelism in pg_basebackup to justify
that answer.

Regards,
--
-David
david(at)pgmasters(dot)net
