Re: Proposal for enabling auto-vectorization for checksum calculations

From: John Naylor <johncnaylorls(at)gmail(dot)com>
To: Andrew Kim <tenistarkim(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Oleg Tselebrovskiy <o(dot)tselebrovskiy(at)postgrespro(dot)ru>
Subject: Re: Proposal for enabling auto-vectorization for checksum calculations
Date: 2025-11-14 11:34:37
Message-ID: CANWCAZY940P3wGOQAZWMLQL4MQGGyOu7WBjBEcn_gqcrr+NvAw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 6, 2025 at 6:50 AM Andrew Kim <tenistarkim(at)gmail(dot)com> wrote:
> The v9 patch series is attached.

Thanks! BTW, I've set the commitfest entry to "Needs Review".

I spent some time with the v9-0001 refactoring patch, and I have one
observation that's worth sharing now:

--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -29,8 +29,7 @@
#include "getopt_long.h"
#include "pg_getopt.h"
#include "storage/bufpage.h"
-#include "storage/checksum.h"
-#include "storage/checksum_impl.h"
+#include "port/checksum.h"

Outside the backend it is no longer required to include the
implementation header -- it just links and works correctly. That seems
like a good thing. In fact...

On Tue, Oct 21, 2025 I wrote:
> Looking at commit f04216341dd1, we have at least one example of an
> external program, pg_filedump. If we can keep this working with
> minimal fuss, it should be fine everywhere.

> https://github.com/df7cb/pg_filedump/blob/master/pg_filedump.c#L29
>
> ```
> /* checksum_impl.h uses Assert, which doesn't work outside the server */
> #undef Assert
> #define Assert(X)
>
> #include "storage/checksum.h"
> #include "storage/checksum_impl.h"
> ```

I tried building pg_filedump against the server with just the 0001
patch and it also builds fine leaving out the implementation:

diff --git a/pg_filedump.c b/pg_filedump.c
index 606a85b..0268381 100644
--- a/pg_filedump.c
+++ b/pg_filedump.c
@@ -26,12 +26,7 @@

#include <utils/pg_crc.h>

-/* checksum_impl.h uses Assert, which doesn't work outside the server */
-#undef Assert
-#define Assert(X)
-
-#include "storage/checksum.h"
-#include "storage/checksum_impl.h"
+#include "port/checksum.h"
#include "decode.h"
#include <inttypes.h>

I verified that it does in fact break when built against our master
branch without the impl.h header.

Further, if I replace the above CRC #include to point instead to our
hardware-specific API in port/pg_crc32c.h, it builds fine after
adjusting the typedef that it expects, and interprets the control file
normally:

<pg_control Contents> *********************************************

CRC: Correct
pg_control Version: 1800
Catalog Version: 202511051
...

I'll think more about this.

--
John Naylor
Amazon Web Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2025-11-14 11:39:34 Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Previous Message Ayush Vatsa 2025-11-14 11:19:30 Clarification on when _PG_init() is invoked for extensions