| From: | Michael Paquier <michael(at)paquier(dot)xyz> |
|---|---|
| To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
| Cc: | John H <johnhyvr(at)gmail(dot)com>, Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, wenhui qiu <qiuwenhuifx(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Justin Kwan <justinpkwan(at)outlook(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, vignesh ravichandran <admin(at)viggy28(dot)dev>, "hlinnaka(at)iki(dot)fi" <hlinnaka(at)iki(dot)fi> |
| Subject: | Re: Making pg_rewind faster |
| Date: | 2025-10-28 04:02:06 |
| Message-ID: | aQBAPv1Gk6uYVDTd@paquier.xyz |
| Lists: | pgsql-hackers |
On Thu, Oct 23, 2025 at 08:40:14AM -0400, Robert Haas wrote:
> While I'm not against cross-checking against the control file, this
> sounds like an imaginary scenario to me. That is, it would only happen
> if somebody maliciously modified the contents of the data directory by
> hand with the express goal of breaking the tool. But we fundamentally
> cannot defend against a malicious user whose express goal is to break
> the tool, and I do not see any compelling reason to expend energy on
> it even in cases like this where we could theoretically detect it
> without much effort. If we go down that path, we'll end up not only
> complicating the code, but also obscuring our own goals: it will look
> like we've either done too much sanity checking (because we will have
> added checks that are unnecessary with a non-malicious user) or too
> little (because we will not have caught all the things a malicious
> user might do).
I was thinking about this argument over the weekend, and I am
wondering if we could do better here at detecting whether a file
should be copied or not. What if we computed a checksum of each file
that exists on both the target and the source, and skipped the copy
when the checksums match? You cannot do that for relation files when
the source is online, of course, but for files like the oldest
segments before the divergence point, that's more reliable than
checking the size, though more expensive due to the cost of the
checksum computation. And there is a sha256() available at SQL level.
Just throwing one idea in the bucket of ideas. That may not be worth
the extra cost here, of course, but attaching a checksum to
file_entry_t is not what I would qualify as an invasive change.
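To make the idea concrete, here is a minimal sketch, written in Python rather than in pg_rewind's C for brevity. Everything in it is an assumption for illustration: `FileEntry` is a hypothetical stand-in for file_entry_t and not the actual struct, the helper names are made up, and both trees are read from local directories, whereas an online source would have to compute its checksum remotely (for example with the SQL-level sha256()).

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

CHUNK_SIZE = 1024 * 1024  # read in 1 MiB chunks to keep memory use bounded


def sha256_of_file(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read incrementally."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


@dataclass
class FileEntry:
    """Hypothetical stand-in for file_entry_t, extended with checksums."""
    path: str
    source_checksum: Optional[str] = None  # None if absent on the source
    target_checksum: Optional[str] = None  # None if absent on the target


def build_entry(relpath: str, source_root: Path, target_root: Path) -> FileEntry:
    """Fill in a checksum for each side where the file exists."""
    entry = FileEntry(path=relpath)
    src = source_root / relpath
    tgt = target_root / relpath
    if src.is_file():
        entry.source_checksum = sha256_of_file(src)
    if tgt.is_file():
        entry.target_checksum = sha256_of_file(tgt)
    return entry


def needs_copy(entry: FileEntry) -> bool:
    """Skip the copy only when both sides exist and their checksums match."""
    if entry.source_checksum is None or entry.target_checksum is None:
        return True
    return entry.source_checksum != entry.target_checksum
```

In this sketch, two files of the same size with different contents still get copied, which is the case a size-only comparison would miss. The cost is one full read of the file on each side.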
--
Michael