Re: [PATCH] Incremental sort (was: PoC: Partial sort)

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Shaun Thomas <shaun(dot)thomas(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andreas Karlsson <andreas(at)proxel(dot)se>
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date: 2020-04-03 00:20:28
Message-ID: 20200403002028.kp3rxncmwvsui6xz@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Thanks, the v52 looks almost ready. I've been looking at the two or
three things I mentioned, and I have a couple of comments.

1) /* XXX comparison_cost shouldn't be 0? */

I'm not worried about this, because this is not really intriduced by
this patch - create_sort_path has the same comment/issue, so I think
it's acceptable to do the same thing for incremental sort.

2) INITIAL_MEMTUPSIZE

tuplesort.c does this:

#define INITIAL_MEMTUPSIZE Max(1024, \
ALLOCSET_SEPARATE_THRESHOLD / sizeof(SortTuple) + 1)

supposedly to keep the array size within ALLOCSET_SEPARATE_THRESHOLD.
But I think it fails to do that, for a couple of reasons.

Firstly, ALLOCSET_SEPARATE_THRESHOLD is 8192, and SortTuple is 21B at
minimum (without padding), so with 1024 elements it's guaranteed to be
at least 21kB - so exceeding the threshold. The maximum value is
something like 256.

Secondly, the second part of the formula is guaranteed to get us over
the threshold anyway, thanks to the +1. Let's say SortTuple is 30B. Then

ALLOCSET_SEPARATE_THRESHOLD / 30 = 273

but we end up using 274, resulting in 8220B array. :-(

So I guess the formula should be more like

Min(128, ALLOCSET_SEPARATE_THRESHOLD / sizeof(SortTuple))

or something like that.

FWIW I think the whole hypothesis that selecting the array size below
ALLOCSET_SEPARATE_THRESHOLD reduces overhead is dubious. AFAIC we
allocate this only once (or very few times), and if we need to grow the
array we'll hit the threshold anyway.

I'd just pick a reasonable constant - 128 or 256 seems reasonable, 1024
may be OK too.

3) add_partial_path

I'm a bit torn about using enable_incrementalsort in add_partial_path. I
know we've done that to eliminate the overhead, but it just seems weird.
I wonder if that really saves us anything or if it was just noise. I'll
do more testing before making a decision.

While looking at add_path, I see the comparisons there are made in the
opposite order - we first compare costs, and only then pathkeys. That
seems like a more efficient way, I guess (cheaper float comparison
first, pathkey comparison with a loop second). I wonder why it's not
done like that in add_partial_path too? Perhaps this will make it cheap
enough to remove the enable_incrementalsort check.

4) add_partial_path_precheck

While looking at add_partial_path, I realized add_partial_path_precheck
probably needs the same change - to consider startup cost too. I think
the expectation is that add_partial_path_precheck only rejects paths
that we know would then get rejected by add_partial_path.

But now the behavior is inconsistent, which might lead to surprising
behavior (I haven't seen such cases, but I haven't looked for them).
At the moment the add_partial_path_precheck is called only from
joinpath.c, maybe it's not an issue there.

If it's OK to keep it like this, it'd probably deserve a comment why
the difference is OK. And the comments also contain the same claim that
parallel plans only need to look at total cost.

5) Overall, I think the costing is OK. I'm sure we'll find cases that
will need improvements, but that's fine. However, we now have

- cost_tuplesort (used to be cost_sort)
- cost_full_sort
- cost_incremental_sort
- cost_sort

I find it a bit confusing that we have cost_sort and cost_full_sort. Why
don't we just keep using the dummy path in label_sort_with_costsize?
That seems to be the only external caller outside costsize.c. Then we
could either make cost_full_sort static or get rid of it entirely.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2020-04-03 00:24:10 src/test/regress/standby_schedule
Previous Message Andres Freund 2020-04-03 00:17:13 Re: snapshot too old issues, first around wraparound and then more.