Re: [HACKERS] [PATCH] Incremental sort

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, Antonin Houska <ah(at)cybertec(dot)at>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [PATCH] Incremental sort
Date: 2018-04-06 17:26:23
Message-ID: 861f7398-59fd-2aeb-964d-38b150687f25@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On 04/06/2018 01:43 AM, Alexander Korotkov wrote:
> Hi!
>
> On Tue, Apr 3, 2018 at 2:10 PM, Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com <mailto:tomas(dot)vondra(at)2ndquadrant(dot)com>> wrote:
>
> I think solving this may be fairly straight-forward. Essentially, until
> now we only had one way to do the sort, so it was OK to make the sort
> implicit by checking if the path is sorted
>
>     if (input not sorted)
>     {
>         ... add a Sort node ...
>     }
>
> But now we have multiple possible ways to do the sort, with different
> startup/total costs. So the places that create the sorts need to
> actually generate the Sort paths for each sort alternative, and store
> the information in the Sort node (instead of relying on pathkeys).
>
> Ultimately, this should simplify the createplan.c places making all the
> make_sort calls unnecessary (i.e. the input should be already sorted
> when needed). Otherwise it'd mean the decision needs to be done locally,
> but I don't think that should be needed.
>
> But it's surely a fairly invasive change to the patch ...
>
>
> Right, there are situation when incremental sort has lower startup cost,
> but higher total cost.  In order to find lower cost, we ideally should
> generate
> paths for both full sort and incremental sort.  However, that would increase
> number of total pathes, and could slowdown planning time.  Another issue
> that we don't always generate pathes for sort.  And yes, it would be rather
> invasive.  So, that doesn't look feasible to get into 11.
>

I agree that's probably not feasible for PG11, considering the freeze is
about 48h from now. Not necessarily because of amount of code needed to
do that (it might be fairly small, actually) but because of the risk of
regressions in other types of plans and lack of time for review/testing.

I do not think this would cause a significant increase in path number.
We already do have the (partially sorted) paths in pathlist, otherwise
v21 wouldn't be able to build the incremental sort path anyway. And the
plans that did the decision in createplan.c could fairly easily do the
decision when constructing the path, I believe.

Looking v21, this affects three different places:

1) create_merge_append_plan

For merge_append the issue is that generate_mergeappend_paths() calls
get_cheapest_path_for_pathkeys(), which however only looks at cheapest
startup/total path, and then simply falls-back to cheapest_total_path if
there are no suitably sorted paths. IMHO if you modify this to also
consider partially-sorted paths, it should work. You'll have to account
for the extra cost of incremental cost, and it needs to be fairly cheap
(perhaps even by first quickly computing some initial cost estimate -
see e.g. initial_cost_nestloop/final_cost_nestloop).

2) create_mergejoin_plan

For mergejoin, the issue is that sort_inner_and_outer() only looks at
cheapest_total_path for both sides, even before computing the merge
pathkeys. So some of the code would have to move into the foreach loop,
and the paths would be picked by get_cheapest_path_for_pathkeys(). This
time only using total cost, per the comment in sort_inner_and_outer().

3) create_gather_merge_plan

This seems fairly simple - the issue is that gather_grouping_paths()
only looks at cheapest_partial_path. Should be a matter of simply
calling the improved get_cheapest_path_for_pathkeys().

Of course, this is all fairly hand-wavy and it may turn out to be fairly
expensive in some cases. But we can use another trick - we don't need to
search through all partially sorted paths, because for each pathkey
prefix there can be just one "best" path for startup cost and one for
total cost. So we could maintain a much shorter list of partially sorted
paths, I think.

>
> Intead, I decided to cut usage of incremental sort.  Now, incremental sort
> is generated only in create_sort_path().  Cheaper path selected between
> incremental sort and full sort with taking limit_tuples into account.
> That limits usage of incremental sort, however risk of regression by this
> patch is also minimal.  In fact, incremental sort will be used only when
> sort is explicitly specified and simultaneously LIMIT is specified or
> dataset to be sorted is large and incremental sort saves disk IO.
>

I personally am OK with reducing the scope of the patch like this. It's
still beneficial for the common ORDER BY + LIMIT case, which is good. I
don't think it may negatively affect other cases (at least I can't think
of any).

It's pretty obvious it may be extremely useful for the other cases too
(particularly for mergejoin on large tables, where it can allow
in-memory sort with quick startup).

But even if you managed to make the necessary code changes, it's
unlikely any experienced committer will look into such significant
change this close to the cutoff. Either they are going to be away for a
weekend, or they are already looking at other patches :-(

> Attached patch also incorporates following commits made by Alexander
> Kuzmenkov:
> * Rename fields of IncrementalSortState to snake_case for the sake of
> consistency.
> * Rename group test function to isCurrentGroup.
> * Comments from Tomas Vondra about nodeIncrementalSort.c
> * Add a test for incremental sort.
> * Add a separate function to calculate costs of incremental sort.
>

Those changes seem fine, but are still a couple of issues remaining:

1) pathkeys_useful_for_ordering() still uses enable_incrementalsort,
which I think is a bad idea. I've complained about it in my review on
31/3, and I don't see any explanation why this is a good idea.

2) Likewise, I've suggested that the claim about abbreviated keys in
nodeIncrementalsort.c is dubious. No response, and the XXX comment was
instead merged into the patch:

* XXX The claim about abbreviated keys seems rather dubious, IMHO.

3) There is a comment at cost_merge_append, despite there being no
relevant changes in that function. Misplaced comment?

4) It's not clear to me why INITIAL_MEMTUPSIZE is defined the way it is.
There needs to be a comment - the intent seems to be making it large
enough to exceed ALLOCSET_SEPARATE_THRESHOLD, but it's not quite clear
why that's a good idea.

5) I do get this warning when building the code:

costsize.c: In function ‘cost_incremental_sort’:
costsize.c:1812:2: warning: ISO C90 forbids mixed declarations and code
[-Wdeclaration-after-statement]
List *presortedExprs = NIL;
^~~~

6) The comment at cost_incremental_sort talks about cost_sort_internal,
but it's cost_sort_tuplesort I guess.

7) The new code in create_sort_path is somewhat ugly, I guess. It's
correct, but it really needs to be made easier to comprehend. I might
have time to look into that tomorrow, but I can't promise that.

Attached is a diff highlighting some of those places, and couple of
minor code formatting fixes.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
incremental-sort-v22-review.diff text/x-patch 4.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2018-04-06 17:33:56 Re: WIP: Covering + unique indexes.
Previous Message Peter Geoghegan 2018-04-06 17:22:34 Re: WIP: Covering + unique indexes.