Re: Proposal for enabling auto-vectorization for checksum calculations

From: Andrew Kim <tenistarkim(at)gmail(dot)com>
To: Oleg Tselebrovskiy <o(dot)tselebrovskiy(at)postgrespro(dot)ru>
Cc: John Naylor <johncnaylorls(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Proposal for enabling auto-vectorization for checksum calculations
Date: 2025-10-18 21:30:21
Message-ID: CAK64mnd6VAtTbar=QS4sbWJpfFxjksjg2ERNkGw1tRwh_kc6mw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Oleg,

Thank you very much for the detailed and constructive feedback on v6 patch.
It was extremely helpful in refining the architecture and ensuring
compliance with PostgreSQL coding standards.

I have updated the patch to V7, which I believe addresses all of your
points, including the critical architectural concerns regarding file
organization and linking.

Key Changes and Feedback Resolution in V7

The architecture is now consolidated in the src/port module.
1. Compiler Flags (Unroll/Vectorize)Resolved: Compiler flags
(CFLAGS_UNROLL_LOOPS) are now correctly placed and applied to
checksum.c in src/port/Makefile and src/port/meson.
2. Header OrganizationResolved: checksum.h and checksum_impl.h have
been moved from src/include/storage/ to src/include/port/ for
consistent module organization.
3. External Program CompatibilityResolved: checksum_impl.h is now
fully self-contained. It provides the static inline implementations
(pg_checksum_block_default, pg_checksum_block_avx2) and all required
constants, ensuring external tools can calculate checksums without
linking to the backend library.
4. Duplicate FilesResolved: The redundant
src/backend/storage/page/checksum.c file has been removed,
consolidating all implementation logic into src/port/checksum.c.
5. Function NamingResolved: The dispatch pattern now uses
pg_checksum_block_choose, aligning with the established naming
conventions (e.g., CRC32C module). The implementations use the clear
names pg_checksum_block_default and pg_checksum_block_avx2.
7. Documentation/CommentsResolved: Comprehensive documentation,
including the detailed FNV-1a algorithm comments, has been restored to
the portable implementation (pg_checksum_block_default).

Best regards,
Andrew Kim

On Fri, Oct 17, 2025 at 3:53 AM Oleg Tselebrovskiy
<o(dot)tselebrovskiy(at)postgrespro(dot)ru> wrote:
>
> Greetings!
>
> I've also tried to use AVX2 to speedup checksums and I've found your
> approach quite interesting
>
> But I see some issues with v6 patch
>
> 1) checksum.c was moved to src/port, but special meson rules are left in
> src/backend/storage/page/meson.build. As a result, assembly code for
> moved src/port/checksum.c doesn't use -funroll-loops and
> -ftree-vectorize (latter isn't probably needed now, due to the nature of
> the patch). The same is true for src/port/Makefile, there are no
> instructions to use CFLAGS_UNROLL_LOOPS and CFLAGS_VECTORIZE
>
> 2) checksum.c was moved to src/port, but checksum.h and checksum_impl.h
> are left in src/include/storage. I think they both should be moved to
> src/include/port, as John Naylor suggested in his review of v5
>
> 3) checksum_impl.h now doesn't provide any code, so including it in
> external programs won't allow checksum calculation. I think that all
> code should be in checksum_impl.h, and external programs could just
> define USE_AVX2_WITH_RUNTIME_CHECK (probably using similar checks as we
> are) to use AVX2 implementation. If not - then they will default to
> default realisation
>
> 4) I don't understand why do we need to check for AVX2 intrinsics if we
> don't use those in code (at least I don't see them directly)? As in
> review of v5, couldn't test functions in configure, config/c-compiler.m4
> and ./meson.build just be {return 0;} or {return 1;}?
>
> 5) Why do we need both src/backend/storage/page/checksum.c and
> src/port/checksum.c?
>
> 6)
> > +/* Function declarations for ISA-specific implementations */
> > +uint32 pg_checksum_block_default(const PGChecksummablePage *page);
> > +#ifdef USE_AVX2_WITH_RUNTIME_CHECK
> > +uint32 pg_checksum_block_avx2(const PGChecksummablePage *page);
> > +#endif
>
> What is "ISA-specific implementations" in this comment? Maybe I'm just
> not familiar with the term? Or is it an artifact from macro
> implementation?
>
> 7) Why remove all comments from code of pg_checksum_block_default? I
> could understand if you just removed comments from
> pg_checksum_block_avx2, since it just duplicates code (though I
> personally would leave all the comments even when duplicating code), but
> I don't understand removing comments from pg_checksum_block_default
>
> 8) It might be a personal taste, but pg_checksum_block_dispatch looks
> more like "choose" function from src/port/pg_crc32c_sse42_choose.c and
> alike. "dispatch" from src/include/port/pg_crc32c looks a little
> different - we don't choose function pointer once there, we choose
> between inlined computation and calling a function with runtime check.
> So I'd suggest changing name of pg_checksum_block_dispatch to
> pg_checksum_block_choose
>
> Other than those, I think the core of this patch is good
>
> Oleg Tselebrovskiy, PostgresPro

Attachment Content-Type Size
v7-0001-Enable-autovectorizing-pg_checksum_block-with-AVX2-runtime-detection.patch application/octet-stream 23.3 KB
v7-0002-Benchmark-code-for-postgres-checksums.patch application/octet-stream 5.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Philip Alger 2025-10-18 21:34:59 Re: [PATCH] Add pg_get_trigger_ddl() to retrieve the CREATE TRIGGER statement
Previous Message David Rowley 2025-10-18 21:15:07 Re: A tidyup of pathkeys.c