Re: tuple radix sort

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: John Naylor <johncnaylorls(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: tuple radix sort
Date: 2025-10-30 01:56:06
Message-ID: 72B9451B-0121-491E-A702-BE8AAEFDB80E@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Oct 29, 2025, at 19:29, John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
>
> On Wed, Oct 29, 2025 at 3:25 PM Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>>> On Oct 29, 2025, at 14:28, John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
>>>
>>> I suspect the challenge
>>> will be multikey sorts when the first key has low cardinality.
>>
>> As you predicted, when the first key has very low cardinality, radix is a little bit slower. I built a test that proves that:
>>
>> ```
>> evantest=# drop table if exists test_multi;
>> evantest=# create unlogged table test_multi (category int, name text);
>> — first column has only 4 distinct values
>
> Thanks for testing. Note it's actually 5 because of rounding.

Yes, 0-4, totally 5.

> Your
> text also seems to have em-dashes and unicode apostrophes where it
> should have dashes / single quotes. That's not great if you expect
> others to try to reproduce.

I just copied the content from psql (running in iTerm). I did a Google search, and found that was because of Mac Mail’s “smart quotes” substitution. Looks like even I manually type in a pair of single quotes, it still does the substitution. I will try to see how to disable that, but I don’t want to switch to another mail app.

> I'm also not thrilled about having to
> remove your psql prompt.
>

I just wanted to show my entire test process, so I simply copied all contents from psql. In future, I will remove psql prompts from reproduce procedure.

> drop table if exists test_multi;
> create unlogged table test_multi (category int, name text);
> insert into test_multi select (random() * 4)::int as category,
> md5(random()::text) || md5(random()::text) as name from
> generate_series(1, 1000000);
> vacuum freeze test_multi;
>
> Anyway, because this table is larger than my first example, the input
> no longer fits into 64MB of work_mem and it switches to an external
> merge sort. Normally I set work_mem to 1GB for testing sorts so I
> don't have to think about it, but neglected to in my first email.

I changed work_men to 1GB and reran the test. As the high cardinality data are still there, so I first reran with data:

```
evantest=# set work_mem = '1GB';
Time: 0.301 ms
evantest=#
evantest=# select * from test_multi order by category, name;
Time: 575.247 ms
evantest=# select * from test_multi order by category, name;
Time: 554.351 ms
evantest=# select * from test_multi order by category, name;
Time: 565.100 ms
evantest=#
evantest=# set wip_radix_sort = 'on';
Time: 0.752 ms
evantest=# select * from test_multi order by category, name;
Time: 558.057 ms
evantest=# select * from test_multi order by category, name;
Time: 565.542 ms
evantest=# select * from test_multi order by category, name;
Time: 559.973 ms
```

With radix_sort on and off, execution time are almost the same.

Then I restore the data to low cardinality, off is still faster than on:
```
evantest=# set wip_radix_sort = ‘off';
Time: 0.549 ms
evantest=# select * from test_multi order by category, name;
Time: 5509.075 ms (00:05.509)
evantest=# select * from test_multi order by category, name;
Time: 5553.566 ms (00:05.554)
evantest=# select * from test_multi order by category, name;
Time: 5598.595 ms (00:05.599)
evantest=# set wip_radix_sort = ‘on';
Time: 0.786 ms
evantest=#
evantest=# select * from test_multi order by category, name;
Time: 5770.964 ms (00:05.771)
evantest=# select * from test_multi order by category, name;
Time: 5779.755 ms (00:05.780)
evantest=# select * from test_multi order by category, name;
Time: 5851.134 ms (00:05.851)
evantest=#
evantest=# set work_mem = '2GB’; # increasing work_mem to 2GB doesn’t help
Time: 0.404 ms
evantest=#
evantest=# select * from test_multi order by category, name;
Time: 5781.005 ms (00:05.781)
evantest=# select * from test_multi order by category, name;
Time: 5826.025 ms (00:05.826)
evantest=# select * from test_multi order by category, name;
Time: 5937.919 ms (00:05.938)
```

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2025-10-30 01:57:24 Re: create table like including storage parameter
Previous Message David Rowley 2025-10-30 01:51:52 Re: Fix bogus use of "long" in aset.c