|From:||Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>|
|To:||Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>|
|Cc:||Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, Antonin Houska <ah(at)cybertec(dot)at>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>|
|Subject:||Re: [HACKERS] [PATCH] Incremental sort|
|Views:||Raw Message | Whole Thread | Download mbox|
On Tue, Apr 3, 2018 at 2:10 PM, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
> I think solving this may be fairly straight-forward. Essentially, until
> now we only had one way to do the sort, so it was OK to make the sort
> implicit by checking if the path is sorted
> if (input not sorted)
> ... add a Sort node ...
> But now we have multiple possible ways to do the sort, with different
> startup/total costs. So the places that create the sorts need to
> actually generate the Sort paths for each sort alternative, and store
> the information in the Sort node (instead of relying on pathkeys).
> Ultimately, this should simplify the createplan.c places making all the
> make_sort calls unnecessary (i.e. the input should be already sorted
> when needed). Otherwise it'd mean the decision needs to be done locally,
> but I don't think that should be needed.
> But it's surely a fairly invasive change to the patch ...
Right, there are situation when incremental sort has lower startup cost,
but higher total cost. In order to find lower cost, we ideally should
paths for both full sort and incremental sort. However, that would increase
number of total pathes, and could slowdown planning time. Another issue
that we don't always generate pathes for sort. And yes, it would be rather
invasive. So, that doesn't look feasible to get into 11.
Intead, I decided to cut usage of incremental sort. Now, incremental sort
is generated only in create_sort_path(). Cheaper path selected between
incremental sort and full sort with taking limit_tuples into account.
That limits usage of incremental sort, however risk of regression by this
patch is also minimal. In fact, incremental sort will be used only when
sort is explicitly specified and simultaneously LIMIT is specified or
dataset to be sorted is large and incremental sort saves disk IO.
Attached patch also incorporates following commits made by Alexander
* Rename fields of IncrementalSortState to snake_case for the sake of
* Rename group test function to isCurrentGroup.
* Comments from Tomas Vondra about nodeIncrementalSort.c
* Add a test for incremental sort.
* Add a separate function to calculate costs of incremental sort.
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
|Next Message||David Rowley||2018-04-06 00:02:11||Re: [HACKERS] path toward faster partition pruning|
|Previous Message||Andrew Gierth||2018-04-05 23:37:42||Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS|