From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Andrei Lepikhov <lepihov(at)gmail(dot)com> |
Cc: | Richard Guo <guofenglinux(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Consider explicit incremental sort for Append and MergeAppend |
Date: | 2025-05-19 13:21:07 |
Message-ID: | CA+Tgmob-7XM5xsB+Vco5XmRkMYKgFi4xhCTfL2N6WJuqYS93CQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, May 15, 2025 at 9:03 AM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
> 2. IncrementalSort is not always more effective - it depends on the
> column's number of groups. In my experience, a non-cost-based decision
> one day meets the problematic case, and the people who stick with it are
> much more confused than in the case when planner decision connected to
> the costings - they trust the cost model or the cost model tuned by GUCs.
I wonder if this could be fixed in nodeIncrementalSort.c. I think it's
a problem to rely on planner estimates because planner estimates of
the # of groups are quite unreliable. But at runtime, we can notice
whether the groups are small or large and adjust the algorithm
accordingly. In fact, it looks like we already have some logic for
that:
/*
* Sorting many small groups with tuplesort is inefficient. In order to
* cope with this problem we don't start a new group until the current one
* contains at least DEFAULT_MIN_GROUP_SIZE tuples (unfortunately this also
* means we can't assume small groups of tuples all have the same prefix keys.)
* When we have a bound that's less than DEFAULT_MIN_GROUP_SIZE we start looking
* for the new group as soon as we've met our bound to avoid fetching more
* tuples than we absolutely have to fetch.
*/
#define DEFAULT_MIN_GROUP_SIZE 32
But maybe this logic needs to be further refined somehow. I can't
shake the feeling that it's going to be really hard to have every
place that uses incremental sort decide whether to use an incremental
sort or a regular sort -- we should try to get to a place where it's
safe to use an incremental sort when it's possible without worrying
that it might actually be slower.
Or maybe that's not possible for some reason, but it is not clear to
me what that reason might be.
--
Robert Haas
EDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Aleksander Alekseev | 2025-05-19 13:22:43 | Re: Proposal: Make cfbot fail on patches not created by "git format-patch" |
Previous Message | Isaac Morland | 2025-05-19 12:41:06 | Re: Violation of principle that plan trees are read-only |