Re: A qsort template

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: A qsort template
Date: 2022-01-12 22:33:17
Message-ID: CAFBsxsEdMW0Jon5s3PMm-HtipoNfN8bK++HDOFNfaFnxfd+7PQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 28, 2021 at 8:16 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:

[v4 patchset]

Hi Thomas,

(Sorry for the delay -- I have some time to put into this now.)

> The lower numbered patches are all things that are reused in many
> places, and in my humble opinion improve the notation and type safety
> and code deduplication generally when working with common types

I think 0001-0003 have had enough review previously to commit them, as
they are mostly notational. There's a small amount of bitrot, but not
enough to change the conclusions any. Also 0011 with the missing
#undef.

On Thu, Aug 5, 2021 at 7:18 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> If somebody wants to get a sense of what the size hit is from all of
> these specializations, I can recommend the diff feature of bloaty:
>
> https://github.com/google/bloaty/blob/master/doc/using.md#size-diffs
>
> Obviously you'd approach this by building postgres without the patch,
> and diffing that baseline to postgres with the patch. And possibly
> variations of the patch, with less or more sort specializations.

Thanks, that's a neat feature! For 0001-0003, the diff shows +700
bytes in memory, so pretty small:

$ bloaty -s vm src/backend/postgres -- src/backend/postgres.orig
FILE SIZE VM SIZE
-------------- --------------
+0.0% +608 +0.0% +608 .text
+0.0% +64 +0.0% +64 .eh_frame
+0.0% +24 +0.0% +24 .dynsym
+0.0% +14 +0.0% +14 .dynstr
+0.0% +2 +0.0% +2 .gnu.version
+0.0% +58 [ = ] 0 .debug_abbrev
+0.1% +48 [ = ] 0 .debug_aranges
+0.0% +1.65Ki [ = ] 0 .debug_info
+0.0% +942 [ = ] 0 .debug_line
+0.1% +26 [ = ] 0 .debug_line_str
+0.0% +333 [ = ] 0 .debug_loclists
-0.0% -23 [ = ] 0 .debug_rnglists
+0.0% +73 [ = ] 0 .debug_str
-1.0% -4 [ = ] 0 .shstrtab
+0.0% +20 [ = ] 0 .strtab
+0.0% +24 [ = ] 0 .symtab
+131% +3.30Ki [ = ] 0 [Unmapped]
+0.0% +7.11Ki +0.0% +712 TOTAL

[back to Thomas]

> I tried to measure a speedup in vacuum, but so far I have not. I did
> learn some things though: While doing that with an uncorrelated index
> and a lot of deleted tuples, I found that adding more
> maintenance_work_mem doesn't help beyond a few MB, because then cache
> misses dominate to the point where it's not better than doing multiple
> passes (and this is familiar to me from work on hash joins). If I
> turned on huge pages on Linux and set min_dynamic_shared_memory so
> that the parallel DSM used by vacuum lives in huge pages, then
> parallel vacuum with a large maintenance_work_mem starts to do much
> better than non-parallel vacuum by improving the TLB misses (as with
> hash joins). I thought that was quite interesting! Perhaps
> bsearch_itemptr might help with correlated indexes with a lot of
> deleted indexes (so not dominated by cache misses), though?
>
> (I wouldn't be suprised if someone comes up with a much better idea
> than bsearch for that anyway... a few ideas have been suggested.)

That's interesting about the (un)correlated index having such a large
effect on cache hit rate! By now there has been some discussion and a
benchmark for dead tuple storage [1]. bit there doesn't seem to be
recent activity on that thread. We might consider adding the ItemPtr
comparator work I did in [2] for v15 if we don't have any of the other
proposals in place by feature freeze. My concern there is the speedups
I observed were observed when the values were comfortably in L2 cache,
IIRC. That would need wider testing.

That said, I think what I'll do next is test the v3-0001 standalone
patch with tuplesort specializations for more data types. I already
have a decent test script that I can build on for this. (this is the
one currently in CI)

Then, I want to do at least preliminary testing of the qsort boundary
parameters.

Those two things should be doable for this commitfest.

[1] https://www.postgresql.org/message-id/CAD21AoAfOZvmfR0j8VmZorZjL7RhTiQdVttNuC4W-Shdc2a-AA%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAFBsxsG_c24CHKA3cWrOP1HynWGLOkLb8hyZfsD9db5g-ZPagA%40mail.gmail.com

--
John Naylor
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2022-01-12 22:43:04 Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path
Previous Message Andrew Dunstan 2022-01-12 22:22:37 Re: Windows vs recovery tests