autovectorize page checksum code included elsewhere

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: autovectorize page checksum code included elsewhere
Date: 2023-11-07 02:47:34
Message-ID: 20231107024734.GB729644@nathanxps13
Lists: pgsql-hackers

(Unfortunately, I'm posting this too late for the November commitfest, but
I'm hoping this will be the first in a series of proposed improvements
involving SIMD instructions for v17.)

Presently, we ask compilers to autovectorize checksum.c and numeric.c. The
page checksum code actually lives in checksum_impl.h, and checksum.c just
includes it. But checksum_impl.h is also used in pg_upgrade/file.c and
pg_checksums.c, and since we don't ask compilers to autovectorize those
files, the page checksum code may remain un-vectorized.
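For illustration, here is a simplified sketch of the kind of loop at issue.
It is not the actual contents of checksum_impl.h; the names sketch_mix,
sketch_checksum, SKETCH_LANES, and SKETCH_PRIME are hypothetical, and the
per-lane seeds are arbitrary. The point is only that the inner loop has a
fixed trip count and no dependencies between lanes, so a compiler can turn
it into SIMD instructions, but typically only if the including file is built
with flags such as -funroll-loops and -ftree-vectorize (which, as far as I
know, is what CFLAGS_UNROLL_LOOPS and CFLAGS_VECTORIZE resolve to on GCC).

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative constants; hypothetical names, not PostgreSQL's definitions. */
#define SKETCH_LANES 32
#define SKETCH_PRIME 16777619u	/* 32-bit FNV prime */

/* One FNV-style mixing step on a single lane. */
static inline uint32_t
sketch_mix(uint32_t sum, uint32_t value)
{
	uint32_t	tmp = sum ^ value;

	return (tmp * SKETCH_PRIME) ^ (tmp >> 17);
}

/*
 * Checksum nrows rows of SKETCH_LANES 32-bit words each.
 *
 * Each lane keeps its own partial sum, so the iterations of the inner loop
 * are independent of one another.  That independence is what lets the
 * compiler vectorize the loop when asked to.
 */
uint32_t
sketch_checksum(const uint32_t *words, size_t nrows)
{
	uint32_t	sums[SKETCH_LANES];
	uint32_t	result = 0;
	size_t		i;
	int			j;

	/* Arbitrary per-lane seeds (an assumption made for this sketch). */
	for (j = 0; j < SKETCH_LANES; j++)
		sums[j] = (uint32_t) j + 1;

	/* Main loop: the inner loop is the vectorization candidate. */
	for (i = 0; i < nrows; i++)
		for (j = 0; j < SKETCH_LANES; j++)
			sums[j] = sketch_mix(sums[j], words[i * SKETCH_LANES + j]);

	/* Fold the partial sums into a single value. */
	for (j = 0; j < SKETCH_LANES; j++)
		result ^= sums[j];

	return result;
}
```

Because code like this lives in a header, every .c file that includes it
gets its own compiled copy, built with that file's flags. One copy can
therefore end up vectorized while another stays scalar.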

The attached patch is a quick attempt at adding CFLAGS_UNROLL_LOOPS and
CFLAGS_VECTORIZE to the CFLAGS for the aforementioned objects. The gains
are modest (i.e., a bit less system CPU time and/or a few percentage points
off the total time), but it seemed like a no-brainer.

Separately, I'm wondering whether we should consider using CFLAGS_VECTORIZE
on the whole tree. Commit fdea253 seems to be responsible for introducing
this targeted autovectorization strategy, and AFAICT this was just done to
minimize the impact elsewhere while optimizing page checksums. Are there
fundamental problems with adding CFLAGS_VECTORIZE everywhere? Or is it
just waiting on someone to do the analysis/benchmarking?


Nathan Bossart
Amazon Web Services:

Attachment: autovectorize_page_checksums_v1.patch (text/x-diff, 818 bytes)

