| From: | Andrew Kim <tenistarkim(at)gmail(dot)com> |
|---|---|
| To: | John Naylor <johncnaylorls(at)gmail(dot)com> |
| Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org, Oleg Tselebrovskiy <o(dot)tselebrovskiy(at)postgrespro(dot)ru> |
| Subject: | Re: Proposal for enabling auto-vectorization for checksum calculations |
| Date: | 2025-10-24 07:48:45 |
| Message-ID: | CAK64mnejn9AZMYz03e7HX8Uui35PihUuOy=b+iBG=YtRKx0Log@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi John,
Thank you for your review on the previous patch versions.
I've carefully addressed your concerns and those raised by Oleg,
specifically focusing on patch separation and simplification of the
configure tests. I am submitting the new version (V8) as two distinct
patches:
V8-0001: Pure refactoring (moving files, updating includes).
V8-0002: Adding the AVX2 feature (detection, dispatch, and optimization).
As requested, I've used in-line responses below to clarify how each
point was handled.
On Mon, Oct 20, 2025 at 8:30 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
>
> On Fri, Oct 17, 2025 at 2:15 PM Andrew Kim <tenistarkim(at)gmail(dot)com> wrote:
> >
> > Hi John,
> >
> > Thank you for your detailed and constructive feedback on the checksum
> > AVX2 optimization patch.
> > I've carefully addressed all of your concerns and am pleased to share
> > the updated V6 implementation.
>
> Great! I know we're on v7 now, but I'm going to make a request for
> next time you respond to a review: Respond in-line to each point. As I
> mentioned before,
>
> > On Wed, Oct 1, 2025 at 10:26 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
> > > (BTW, we discourage top-posting and prefer to cut to size and
> > > use inline responses)
>
> Please don't top-post again, as it clutters our archives in addition
> to making it easy to forget things. I'm now going to copy the things
> that were either not addressed or misunderstood:
>
I apologize for the top-posting in the previous response. I've
switched to the preferred in-line response format for this and all
future correspondence.
> > > I think a good first refactoring patch would be to move
> > > src/backend/storage/checksum.c (which your patch doesn't even touch)
> > > to src/port (and src/include/storage/checksum.h to src/include/port)
> > > and have all callers use that. With that, I imagine only that
> > > checksum.c file would include checksum_impl.h.
> > >
> > > If that poses a problem, let us know -- we may have to further juggle
> > > things. If that works without issue, we can proceed with the
> > > specialization.
>
> That means the first patch moves things around without adding any
> platform-specific code, and the next patch adds the specialization. I
> think that would be a lot easier to review and test, especially to
> avoid breaking external programs (see below for more on this). A
> committer can always squash things together if it make sense to do so.
>
Patch V8-0001 (Move-checksum-functions...): This is now a pure
refactoring patch. It simply moves checksum.c and its headers from
storage/ to port/ and updates the #include paths in all callers
(rawpage.c, pg_checksums.c, etc.). It contains no AVX2 or ISA-specific
code.
Patch V8-0002 (Add-AVX2-optimization...): This patch builds upon the
first, adding all the new AVX2 functionality, detection, and dispatch
logic.
> > > + #if defined(__has_attribute) && __has_attribute (target)
> > > + __attribute__((target("avx2")))
> > > + #endif
> > > + static int avx2_test(void)
> > > + {
> > > + const char buf@<:@sizeof(__m256i)@:>@;
> > > + __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
> > > + accum = _mm256_add_epi32(accum, accum);
> > > + int result = _mm256_extract_epi32(accum, 0);
> > > + return (int) result;
> > > + }],
> > >
> > > If we're just testing if the target works, we can just use an empty
> > > function, right?
>
> Oleg mentioned the same thing later. It's a waste of time for us to
> repeat ourselves. I said you didn't have to worry about it yet,
> because I was hoping to see the refactoring first.
>
I have implemented this simplification in Patch V8-0002. The test in
config/c-compiler.m4 is now a simple, empty function with only the
__attribute__((target("avx2"))) to verify compiler support for the
attribute, as suggested.
> Now, aside from that I looked further into this:
>
> > > The top of the checksum_impl.h has this:
> > >
> > > * This file exists for the benefit of external programs that may wish to
> > > * check Postgres page checksums. They can #include this to get the code
> > > * referenced by storage/checksum.h. (Note: you may need to redefine
> > > * Assert() as empty to compile this successfully externally.)
> > >
> > > It's going to be a bit tricky to preserve this ability while allowing
> > > the core server and client programs to dispatch to a specialized
> > > implementation, but we should at least try. That means keeping
> > > pg_checksum_block() and pg_checksum_page() where they live now.
>
> Looking at commit f04216341dd1, we have at least one example of an
> external program, pg_filedump. If we can keep this working with
> minimal fuss, it should be fine everywhere.
The v8 patch series preserves external compatibility. External
programs like pg_filedump will only need to update their include
paths:
/* OLD */
#include "storage/checksum.h"
#include "storage/checksum_impl.h"
/* NEW */
#include "port/checksum.h"
#include "port/checksum_impl.h"
The function signatures (pg_checksum_block, pg_checksum_page) remain
identical, and checksum_impl.h still contains the complete
implementation that external programs can include. The runtime
dispatch only affects internal PostgreSQL usage.
/* OLD */#include "storage/checksum.h"#include
"storage/checksum_impl.h"/* NEW */ #include "port/checksum.h"#include
"port/checksum_impl.h"
> https://github.com/df7cb/pg_filedump/blob/master/pg_filedump.c#L29
>
> ```
> /* checksum_impl.h uses Assert, which doesn't work outside the server */
> #undef Assert
> #define Assert(X)
>
> #include "storage/checksum.h"
> #include "storage/checksum_impl.h"
> ```
>
> Elsewhere they already have to do things like
>
> ```
> #if PG_VERSION_NUM < 110000
> " Previous Checkpoint Record: Log File (%u) Offset (0x%08x)\n"
> #endif
> ```
>
> ...so it's probably okay if they have to adjust for a new #include
> path, but I want to verify that actually works, and I don't want to
> make it any more invasive than that. As we proceed, I can volunteer to
> do the work to test that pg_filedump still builds fine with small
> changes. Feel free to try building it yourself, but I'm happy to do
> it.
I appreciate your offer to test pg_filedump compatibility. The changes
in v8 should be minimal for external programs - just the include path
updates. If you're willing to test this, it would be very valuable
validation.
>
> Oleg posted another review recently, so I won't complicate things
> further, but from a brief glance I will suggest for next time not to
> change any comments that haven't been invalidated by the patch.
>
In v8, I've been much more conservative about comment changes. I only
updated comments that were directly invalidated by the code changes
(like file path references that changed from storage/ to port/). Other
comments remain untouched unless they were factually incorrect due to
the refactoring.
> --
> John Naylor
> Amazon Web Services
| Attachment | Content-Type | Size |
|---|---|---|
| v8-0001-Move-checksum-functions-from-backend-storage-to-port.patch | application/octet-stream | 8.8 KB |
| v8-0002-Add-AVX2-optimization-for-page-checksum-calculation.patch | application/octet-stream | 10.7 KB |
| v8-0003-Benchmark-code-for-pg_checksum_bench-performance-tes.patch | application/octet-stream | 4.8 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Hayato Kuroda (Fujitsu) | 2025-10-24 07:54:30 | RE: issue with synchronized_standby_slots |
| Previous Message | Arseniy Mukhin | 2025-10-24 07:36:44 | Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue |