From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io> |
Cc: | PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Allow ordered partition scans in more cases |
Date: | 2023-03-03 11:56:49 |
Message-ID: | CAApHDvohAZLQSW4AiHUKmLGNuHYbi0pves+9_9ik3cAYevc2GQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, 23 Feb 2023 at 02:10, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io> wrote:
> I haven't looked too deeply into it, but it seems reasonable that the whole
> sort would cost cheaper than individual sorts on partitions + incremental
> sorts, except when the the whole sort would spill to disk much more than the
> incremental ones. I find it quite difficult to reason about what that threshold
> should be, but I managed to find a case which could fit in a test:
Thanks for coming up with that test case. It's a little disappointing
to see that so many rows had to be added to get the plan to change. I
wonder if it's really worth testing this particular case. ~1800 rows
is a little more significant than I'd have hoped. The buildfarm has a
few dinosaurs that would likely see a noticeable slowdown from that.
What's on my mind now is if turning 1 Sort into N Sorts is a
particularly good idea from a work_mem standpoint. I see that we don't
do tuplesort_end() until executor shutdown, so that would mean that we
could end up using 1 x work_mem per Sort node. I idly wondered if we
couldn't do tuplesort_end() after spitting out the final tuple when
EXEC_FLAG_REWIND is not set, but that would still mean we could use N
work_mems when EXEC_FLAG_REWIND *is* set. We only really have
visibility of that during execution too, so can't really make a
decision at plan time based on that.
I'm not quite sure if I'm being overly concerned here or not. All it
would take to get a sort per partition today would be to put a
suitable index on just 1 of the partitions. So this isn't exactly a
new problem, it's just making an old problem perhaps a little more
likely. The problem does also exist for things like partition-wise
joins too for Hash and Merge joins. Partition-wise joins are disabled
by default, however.
David
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2023-03-03 12:14:42 | Re: Missing update of all_hasnulls in BRIN opclasses |
Previous Message | Heikki Linnakangas | 2023-03-03 11:34:50 | Re: Testing autovacuum wraparound (including failsafe) |