Re: optimize several list functions with SIMD intrinsics

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>, John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: optimize several list functions with SIMD intrinsics
Date: 2023-07-10 10:50:59
Message-ID: 1184a66b-efa4-6fc8-ba8c-72db6f3b1352@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 21/04/2023 23:33, Nathan Bossart wrote:
> On Fri, Apr 21, 2023 at 01:50:34PM +0700, John Naylor wrote:
>> On Wed, Mar 8, 2023 at 7:25 AM Nathan Bossart <nathandbossart(at)gmail(dot)com>
>> wrote:
>>> was mostly a fun weekend project, and I don't presently have any concrete
>>> examples of workloads where this might help.
>>
>> It seems like that should be demonstrated before seriously considering
>> this, like a profile where the relevant list functions show up.
>
> Agreed.

Grepping for "tlist_member" and "list_delete_ptr", I don't see any
callers in hot codepaths where this could make a noticeable difference.
So I've marked this as Returned with Feedback in the commitfest.

> I noticed that several of the List functions do simple linear searches that
> can be optimized with SIMD intrinsics (as was done for XidInMVCCSnapshot in
> 37a6e5d). The following table shows the time spent iterating over a list
> of n elements (via list_member_int) one billion times on my x86 laptop.
>
>
> n | head (ms) | patched (ms)
> ------+-----------+--------------
> 2 | 3884 | 3001
> 4 | 5506 | 4092
> 8 | 6209 | 3026
> 16 | 8797 | 4458
> 32 | 25051 | 7032
> 64 | 37611 | 12763
> 128 | 61886 | 22770
> 256 | 111170 | 59885
> 512 | 209612 | 103378
> 1024 | 407462 | 189484

I'm surprised to see an improvement with n=2 and n=2. AFAICS, the
vectorization only kicks in when n >= 8.

--
Heikki Linnakangas
Neon (https://neon.tech)

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Sergei Kornilov 2023-07-10 11:03:19 Re:doc: clarify the limitation for logical replication when REPILICA IDENTITY is FULL
Previous Message Matthias van de Meent 2023-07-10 10:22:38 Re: PATCH: Using BRIN indexes for sorted output