From: | Magnus Hagander <magnus(at)hagander(dot)net> |
---|---|
To: | Michael Banck <michael(dot)banck(at)credativ(dot)de> |
Cc: | Stephen Frost <sfrost(at)snowman(dot)net>, David Steele <david(at)pgmasters(dot)net>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Subject: | Re: [PATCH] Verify Checksums during Basebackups |
Date: | 2018-04-03 11:52:21 |
Message-ID: | CABUevEz4GCyrOr_o48tN-Yvpu61WN4vTLO2W46TXvRKi5L04Qw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Apr 2, 2018 at 2:48 PM, Michael Banck <michael(dot)banck(at)credativ(dot)de>
wrote:
> Hi!
>
> On Sun, Apr 01, 2018 at 05:27:21PM +0200, Magnus Hagander wrote:
> > It might be a micro-optimisation, but ISTM that we shouldn't do the
> > basename(palloc(fn)) in is_heap_file() unless we actually need it -- so
> not
> > at the top of the function. (And surely "." and ".." should not occur
> here?
> > I think that's a result of copy/paste from the online checksum patch?
> >
> > We also do exactly the same basename(palloc(fn)) in sendFile(). Can we
> > find a way to reuse that duplication? Perhaps we want to split it into
> the
> > two pieces already out in sendFile() and pass it in as separate
> parameters?
>
> I've done the latter now, although it looks a bit weird that the second
> argument data is a subset of the first. I couldn't find another way to
> not have it done twice, though.
>
I agree, but I think it's still cleaner.
On further look, there is actually no need to pstrdup() at all -- we never
used the modified part of the string anyway, because we don't care about
the oid (unlike pg_verify_checksums).
So I adjusted the patch by that.
> If not then this second one should at least only be done inside the if
> > (verify_checksums).
>
> We can't have both, as we need to call the is_heap_file() function in
> order to determine whether we should verify the checksums.
>
Right. I realize that -- thus the "if not". But I guess I was not clear in
what I meant -- see attached file for it.
> There is a bigger problem next to that though -- I believe basename()
> does
> > not exist on Win32. I haven't tested it, but there is zero documentation
> of
> > it existing, which usually indicates it doesn't. That's the part that
> > definitely needs to get fixed.
> >
> > I think you need to look into the functionality in port/path.c, in
> > particular last_dir_separator()?
>
> Thanks for the pointer, I've used that now; I mentioned before that
> basename() might be a portability hazard, but couldn't find a good
> substitute myself.
>
Yeah, I have a recollection somewhere of running into this before, but I
couldn't find any references. But the complete lack of docs about it on
msdn.microsoft.com is a clear red flag :)
>
> V6 of the patch is attached.
>
>
Excellent. I've done some mangling on it:
* Changed the is_heap_file to is_checksummed_file (and the associtaed
struct name), because this is really what it's about (we for example verify
checksums on indexes, which are clearly not heaps)
* Moved the list of files to the top of the file next to the other lists of
files/directories
* Added missing function prototype at the top, and changed the parameter
names to be a bit more clear
* Added some comments
* Changed the logic around the checksum-check to avoid the pstrdup() and to
not call the path functions unless necessary (per comment above)
* "filen" -> "file" in message
* xlog.h does not need to be included
* pgindent
Remaining question:
The check for (cnt % BLCKSZ != 0) currently does "continue", which means
that this block of data isn't actually sent to the client at all, which
seems completely wrong. We only want to prevent checksum validations.
I have moved the check up a bit, and refactored it so it continues to do
the actual transmission of the file if this path is hit.
I have pushed an updated patch with those changes. Please review the result
and let me know I broke something :)
Thanks!!
--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/>
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>
From | Date | Subject | |
---|---|---|---|
Next Message | Magnus Hagander | 2018-04-03 11:52:26 | pgsql: Validate page level checksums in base backups |
Previous Message | Dmitry Ivanov | 2018-04-03 11:28:37 | Re: new function for tsquery creartion |