Re: backup manifests

From: David Steele <david(at)pgmasters(dot)net>
To: Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Suraj Kharage <suraj(dot)kharage(at)enterprisedb(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Tels <nospam-pg-abuse(at)bloodgate(dot)com>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: backup manifests
Date: 2020-03-27 20:39:29
Message-ID: a56b3472-4501-dbb0-1848-8332b6c3bdf4@pgmasters.net
Lists: pgsql-hackers

On 3/27/20 3:55 PM, Stephen Frost wrote:
> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
>> I think that what we have seen so far is that all of the SHA-n
>> algorithms that PostgreSQL supports are about equally slow, so it
>> doesn't really matter which one you pick there from a performance
>> point of view. If you're not saying it has to be SHA-512 but you do
>> want it to be SHA-256, I don't think that really fixes anything. Using
>> CRC-32C does fix the performance issue, but I don't think you like
>> that, either. We could default to having no checksums at all, or even
>> no manifest at all, but I didn't get the impression that David, at
>> least, wanted to go that way, and I don't like it either. It's not the
>> world's best feature, but I think it's good enough to justify enabling
>> it by default. So I'm not sure we have any options here that will
>> satisfy you.
>
> I do like having a manifest by default. At this point it's pretty clear
> that we've just got a fundamental disagreement that more words aren't
> going to fix. I'd rather we play it safe and use a sha256 hash and
> accept that it's going to be slower by default, and then give users an
> option to make it go faster if they want (though I'd much rather that
> alternative be a 64bit CRC than a 32bit one).
>
> Andres seems to agree with you. I'm not sure where David sits on this
> specific question.

I would prefer a stronger checksum as the default, but I would be fine
with SHA1, which is a bit faster.

I believe the overhead of checksums is being overblown. In my experience,
the vast majority of users compress their backups and run them over a
network. Once you have done those two things, the cost of SHA1 is pretty
negligible. As I posted way up-thread, we found that gzip -6 alone pushed
the cost of SHA1 below 3%, and that did not include network transfer.
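
For anyone who wants to check this kind of ratio on their own hardware, here is a rough sketch. It is not the benchmark referred to above: it uses Python's hashlib and zlib (level 6, standing in for gzip -6) on synthetic data, and the helper names (time_call, checksum_share, measure) are made up for illustration. It also assumes "cost" means the share of combined hash-plus-compress time spent hashing, which may not match how the earlier number was computed.

```python
import hashlib
import os
import time
import zlib


def time_call(func, data):
    """Return wall-clock seconds taken by func(data)."""
    start = time.perf_counter()
    func(data)
    return time.perf_counter() - start


def checksum_share(hash_seconds, compress_seconds):
    """Percentage of combined hash + compress time spent hashing."""
    total = hash_seconds + compress_seconds
    return 100.0 * hash_seconds / total if total > 0 else 0.0


def measure(data, level=6):
    """Time SHA-1 and zlib compression separately over the same buffer."""
    hash_s = time_call(lambda d: hashlib.sha1(d).digest(), data)
    comp_s = time_call(lambda d: zlib.compress(d, level), data)
    return hash_s, comp_s, checksum_share(hash_s, comp_s)


if __name__ == "__main__":
    # Synthetic, partly compressible sample (assumption); real relation
    # files will compress and hash at different relative speeds.
    sample = (os.urandom(4096) + b"\x00" * 4096) * 2048  # 16 MiB
    hash_s, comp_s, share = measure(sample)
    print(f"sha1={hash_s:.3f}s zlib-6={comp_s:.3f}s sha1 share={share:.1f}%")
```

The result depends heavily on the data and the CPU, so treat any single run as indicative only. Network transfer is not modelled here, and adding it would shrink the hashing share further.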

Regards,
--
-David
david(at)pgmasters(dot)net
