Re: [HACKERS] [PATCH] Incremental sort

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, Antonin Houska <ah(at)cybertec(dot)at>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [PATCH] Incremental sort
Date: 2018-04-07 13:56:07
Message-ID: CAPpHfdtKHETXhf062CPvkjpG1wnjQ7rv4uLhZgYQ6VZjwqDYpg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Apr 6, 2018 at 11:40 PM, Alexander Kuzmenkov <
a(dot)kuzmenkov(at)postgrespro(dot)ru> wrote:

> On 06.04.2018 20:26, Tomas Vondra wrote:
>
>> I personally am OK with reducing the scope of the patch like this. It's
>> still beneficial for the common ORDER BY + LIMIT case, which is good. I
>> don't think it may negatively affect other cases (at least I can't think
>> of any).
>>
>
> I think we can reduce it even further. Just try incremental sort along
> with full sort over the cheapest path in create_ordered_paths, and don't
> touch anything else. This is a very minimal and a probably safe start, and
> then we can continue working on other, more complex cases. In the attached
> patch I tried to do this. We probably should also remove changes in
> make_sort() and create a separate function make_incremental_sort() for it,
> but I'm done for today.

I've done further unwedding of sort and incremental sort providing them
separate function for plan createion.

2) Likewise, I've suggested that the claim about abbreviated keys in
>>
> nodeIncrementalsort.c is dubious. No response, and the XXX comment was
>> instead merged into the patch:
>>
>> * XXX The claim about abbreviated keys seems rather dubious, IMHO.
>>
>
> Not sure about that, maybe just use abbreviated keys for the first
> version? Later we can research this more closely and maybe start deciding
> whether to use abbrev on planning stage.

That comes from time when we're trying to make incremental sort to be always
not worse than full sort. Now, we have separate paths for full and
incremental sorts,
and some costing penalty for incremental sort. So, incremental sort should
be
selected only when it's expected to give big win. Thus, we can give up
with this
optimization at least in the initial version.

So, removed.

4) It's not clear to me why INITIAL_MEMTUPSIZE is defined the way it is.
>> There needs to be a comment - the intent seems to be making it large
>> enough to exceed ALLOCSET_SEPARATE_THRESHOLD, but it's not quite clear
>> why that's a good idea.
>>
>
> Not sure myself, let's ask the other Alexander.

I've added comment to INITIAL_MEMTUPSIZE. However, to be fair it's not
invention of this patch. Initial size of memtuples array was so previously.
What this patch did is just move it to the macro.

Also, this note hadn't been adressed yet.

On Sat, Mar 31, 2018 at 11:43 PM, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com
> wrote:

> I'm wondering if a static MIN_GROUP_SIZE is good idea. For example, what
> if the subplan is expected to return only very few tuples (say, 33), but
> the query includes LIMIT 1. Now, let's assume the startup/total cost of
> the subplan is 1 and 1000000. With MIN_GROUP_SIZE 32 we're bound to
> execute it pretty much till the end, while we could terminate after the
> first tuple (if the prefix changes).
>
> So I think we should use a Min(limit,MIN_GROUP_SIZE) here, and perhaps
> this should depend on average group size too.
>

I agree with that. For bounded sort, attached patch now selects minimal
group
size as Min(DEFAULT_MIN_GROUP_SIZE, bound). That should improve
"LIMIT small_number" case.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
incremental-sort-24.patch application/octet-stream 97.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-04-07 13:56:44 Re: WIP: a way forward on bootstrap data
Previous Message Jesper Pedersen 2018-04-07 13:55:19 Re: [HACKERS] Runtime Partition Pruning