From: | John Naylor <johncnaylorls(at)gmail(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | branch-free tuplesort partitioning |
Date: | 2024-11-25 12:13:39 |
Message-ID: | CANWCAZbAmaZ7P+ARjS97sJLXsBB5CPZyzFgqNDiqe-L+BqXzug@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Attached is a very rough and limited proof of concept of $subject when
tuplesort uses abbreviated keys. It only works with int64:
Demo:
--setup
drop table if exists test;
create table test (a bigint);
insert into test select (1_000_000_000 * random())::bigint from
generate_series(1,1_000_000,1) i;
vacuum freeze test;
create extension if not exists pg_prewarm ;
select pg_prewarm('test');
set work_mem = '64MB';
\timing
select * from test order by a offset 1_000_000_000;
Time: 238.925 ms
set debug_branchless_sort = 'on'; select * from test order by a offset
1_000_000_000;
Time: 202.977 ms
With the patch, this case is probably held back by our simple
insertion sort, which I haven't changed.
The algorithm with pythonic pseudocode is from [1]. There is also Rust
code under a permissive license by the same authors elsewhere.
To evaluate this technique further, it'll need some work to handle
duplicates well. Then we can run tests on various input distributions.
If that's promising, we'll need to proceed with an idea I had years
ago to separate null/not-null into separate arrays as a prerequisite.
Then, we can probably simplify tuplesort.c some, including removing
each separate template invocation for integer-like keys. I'm also not
using the sort template in the PoC, to simplify development, and it's
not clear this work can be folded back into there without adding a lot
of complexity.
[1] https://orlp.net/blog/branchless-lomuto-partitioning/
--
John Naylor
Amazon Web Services
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Branchless-Lomuto-partitioning.patch | text/x-patch | 7.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2024-11-25 12:16:14 | Re: Add support for Tcl 9 |
Previous Message | Yash Jain | 2024-11-25 11:49:24 | What db objects can only be created with superuser? |