Re: optimize lookups in snapshot [sub]xip arrays

From: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: optimize lookups in snapshot [sub]xip arrays
Date: 2022-07-24 12:26:12
Message-ID: 53cd0f9541e3285e092b818ad19524584c5ac2d3.camel@postgrespro.ru
Lists: pgsql-hackers

On Wed, 13/07/2022 at 10:09 -0700, Nathan Bossart wrote:
> Hi hackers,
>
> A few years ago, there was a proposal to create hash tables for long
> [sub]xip arrays in snapshots [0], but the thread seems to have fizzled out.
> I was curious whether this idea still showed measurable benefits, so I
> revamped the patch and ran the same test as before [1].  Here are the
> results for 60-second runs on an r5d.24xlarge with the data directory on
> the local NVMe storage:
>
>      writers  HEAD  patch  diff
>     ----------------------------
>      16       659   664    +1%
>      32       645   663    +3%
>      64       659   692    +5%
>      128      641   716    +12%
>      256      619   610    -1%
>      512      530   702    +32%
>      768      469   582    +24%
>      1000     367   577    +57%
>
> As before, the hash table approach seems to provide a decent benefit at
> higher client counts, so I felt it was worth reviving the idea.
>
> The attached patch has some key differences from the previous proposal.
> For example, the new patch uses simplehash instead of open-coding a new
> hash table.  Also, I've bumped up the threshold for creating hash tables to
> 128 based on the results of my testing.  The attached patch waits until a
> lookup of [sub]xip before generating the hash table, so we only need to
> allocate enough space for the current elements in the [sub]xip array, and
> we avoid allocating extra memory for workloads that do not need the hash
> tables.  I'm slightly worried about increasing the number of memory
> allocations in this code path, but the results above seemed encouraging on
> that front.
>
> Thoughts?
>
> [0] https://postgr.es/m/35960b8af917e9268881cd8df3f88320%40postgrespro.ru
> [1] https://postgr.es/m/057a9a95-19d2-05f0-17e2-f46ff20e9b3e%402ndquadrant.com
>
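For readers following the thread, the lazy-build scheme described in the quoted message can be sketched roughly as below. This is only an illustration under stated assumptions, not the attached patch: it swaps simplehash for a tiny hand-rolled open-addressing table so that it compiles standalone, and the names XidSet, build_hash and xid_in_set are hypothetical. The threshold of 128 comes from the message; whether the comparison is strict, and the use of 0 as the empty-slot marker, are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;     /* 32-bit xid, as in PostgreSQL */

#define XIP_HASH_THRESHOLD 128      /* threshold quoted in the message */
#define EMPTY_SLOT 0                /* assumption: 0 never appears in xip */

/* Hypothetical stand-in for the xip-related part of a snapshot. */
typedef struct XidSet
{
    const TransactionId *xip;       /* in-progress xids */
    uint32_t    xcnt;               /* number of entries in xip */
    TransactionId *slots;           /* hash table, NULL until first needed */
    uint32_t    mask;               /* table size - 1 (size is a power of two) */
} XidSet;

/* Integer mixer (murmur3 finalizer) to spread consecutive xids. */
static uint32_t
hash_xid(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    h *= 0xc2b2ae35U;
    h ^= h >> 16;
    return h;
}

/* Build the table sized for the current elements only (load factor <= 0.5). */
static void
build_hash(XidSet *set)
{
    uint32_t    size = 1;

    while (size < set->xcnt * 2)
        size <<= 1;
    set->slots = calloc(size, sizeof(TransactionId));
    assert(set->slots != NULL);
    set->mask = size - 1;

    for (uint32_t i = 0; i < set->xcnt; i++)
    {
        uint32_t    pos = hash_xid(set->xip[i]) & set->mask;

        while (set->slots[pos] != EMPTY_SLOT)   /* linear probing */
            pos = (pos + 1) & set->mask;
        set->slots[pos] = set->xip[i];
    }
}

/* Return true if xid is in the set; the table is built on first lookup. */
bool
xid_in_set(XidSet *set, TransactionId xid)
{
    if (set->xcnt <= XIP_HASH_THRESHOLD)
    {
        /* Short arrays: plain linear search, no allocation at all. */
        for (uint32_t i = 0; i < set->xcnt; i++)
            if (set->xip[i] == xid)
                return true;
        return false;
    }

    if (set->slots == NULL)
        build_hash(set);            /* deferred until a lookup happens */

    for (uint32_t pos = hash_xid(xid) & set->mask;
         set->slots[pos] != EMPTY_SLOT;
         pos = (pos + 1) & set->mask)
        if (set->slots[pos] == xid)
            return true;
    return false;
}
```

A caller would initialize XidSet with slots = NULL and mask = 0, call xid_in_set() as often as needed, and free(slots) when the snapshot goes away. Snapshots that are never probed, or that hold 128 or fewer xids, never pay for the allocation, which is the trade-off the quoted message describes.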

I'm glad my idea has been reborn.

Well, maybe simplehash is not a bad idea, although it certainly consumes
more memory and CPU instructions.

I'll try to review.

regards,

Yura Sokolov
