| From: | John Naylor <johncnaylorls(at)gmail(dot)com> |
|---|---|
| To: | Ants Aasma <ants(dot)aasma(at)cybertec(dot)at> |
| Cc: | Andrew Kim <tenistarkim(at)gmail(dot)com>, Oleg Tselebrovskiy <o(dot)tselebrovskiy(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Proposal for enabling auto-vectorization for checksum calculations |
| Date: | 2026-03-31 04:09:26 |
| Message-ID: | CANWCAZYrjnCCE6m=5oRs+Ok=sgMrdf33xM25Fxy3yp=kQAoNwA@mail.gmail.com |
| Lists: | pgsql-hackers |
On Mon, Mar 30, 2026 at 10:01 PM Ants Aasma <ants(dot)aasma(at)cybertec(dot)at> wrote:
>
> On Mon, 30 Mar 2026 at 15:01, John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
> > I don't remember the last time anyone did measurements, so I went
> > ahead and did that:
> >
> > master: 945ms
> > 32 AVX2: 335ms
> > 64 AVX2: 220ms
>
> I'm guessing this is on a recent Intel. Any extra width is helpful on
> Intel as they doubled vpmulld latency from under us after we had
> settled on this algorithm.

It's actually ancient and due to be replaced soon, but still several
years after the adoption of this algorithm.

> FWIW I think AVX2 (x86-64-v3) is fine.

Glad to hear it, although the patch doesn't use that build flag, so
it's not impossible there is some additional difference in the
compiler's model. Still, given the variation you found, I'll make sure
the commit message says "several times faster" so it's not specific to
my hardware.
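
To illustrate why extra width helps when multiply latency is high, here
is a minimal sketch of a lane-parallel multiply-and-xor checksum. It is
not the patch's code: it assumes an FNV-1a-style mixing step, and the
names N_LANES, MIX_PRIME and lane_checksum, as well as the seed values,
are hypothetical and chosen only for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define N_LANES   32            /* hypothetical lane count; the thread compares 32 and 64 */
#define MIX_PRIME 16777619u     /* 32-bit FNV prime, used here only for illustration */

/*
 * Checksum nwords 32-bit words using N_LANES independent partial sums.
 * Only whole blocks of N_LANES words are consumed; a trailing partial
 * block is ignored in this sketch.
 */
uint32_t
lane_checksum(const uint32_t *words, size_t nwords)
{
    uint32_t sums[N_LANES];
    uint32_t result = 0;

    /* arbitrary distinct seeds, an assumption of this sketch */
    for (int j = 0; j < N_LANES; j++)
        sums[j] = (uint32_t) j + 1;

    for (size_t i = 0; i + N_LANES <= nwords; i += N_LANES)
    {
        /*
         * Each lane depends only on its own previous value, so the
         * iterations of this inner loop are independent.  A compiler
         * can map them onto vector registers, and the multiplies of
         * different lanes can overlap instead of waiting on each other.
         */
        for (int j = 0; j < N_LANES; j++)
        {
            uint32_t tmp = sums[j] ^ words[i + j];

            sums[j] = (tmp * MIX_PRIME) ^ (tmp >> 17);
        }
    }

    /* fold the partial sums into a single value */
    for (int j = 0; j < N_LANES; j++)
        result ^= sums[j];

    return result;
}
```

With more lanes there are more independent multiplies in flight per
block, which is one plausible reason a wider variant would hide a
slower vector multiply better. Whether a given compiler actually
vectorizes the inner loop depends on the optimization level and target
options used for the build.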
--
John Naylor
Amazon Web Services