Quick Links

Re: Allow ordered partition scans in more cases

From:	Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
To:	PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc:	David Rowley <dgrowleyml(at)gmail(dot)com>
Subject:	Re: Allow ordered partition scans in more cases
Date:	2023-02-22 13:10:48
Message-ID:	2868814.e9J7NaK4W3@aivenlaptop
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Thank you for improving this optimization !

Le mardi 21 février 2023, 04:14:02 CET David Rowley a écrit :
> I still need to look to see if there's some small amount of data that
> can be loaded into the table to help coax the planner into producing
> the ordered scan for this one. It works fine as-is for ORDER BY a,b
> and ORDER BY a; so I've put tests in for that.

I haven't looked too deeply into it, but it seems reasonable that the whole
sort would cost cheaper than individual sorts on partitions + incremental
sorts, except when the the whole sort would spill to disk much more than the
incremental ones. I find it quite difficult to reason about what that threshold
should be, but I managed to find a case which could fit in a test:

create table range_parted (a int, b int, c int) partition by range(a, b);
create table range_parted1 partition of range_parted for values from (0,0) to
(10,10);
create table range_parted2 partition of range_parted for values from (10,10)
to (20,20);
insert into range_parted(a, b, c) select i, j, k from generate_series(1, 19)
i, generate_series(1, 19) j, generate_series(1, 5) k;
analyze range_parted;
set random_page_cost = 10;
set work_mem = '64kB';
explain (costs off) select * from range_parted order by a,b,c;

It's quite convoluted, because it needs the following:
- estimate the individual partition sorts to fit into work_mem (even if that's
not the case here at runtime)
- estimate the whole table sort to not fit into work_mem
- the difference between the two should be big enough to compensate the
incremental sort penalty (hence raising random_page_cost).

This is completely tangential to the subject at hand, but maybe we have
improvements to do with the way we estimate what type of sort will be
performed ? It seems to underestimate the memory amount needed. I'm not sure
it makes a real difference in real use cases though.

Regards,

--
Ronan Dunklau

In response to

Allow ordered partition scans in more cases at 2023-02-21 03:14:02 from David Rowley

Responses

Re: Allow ordered partition scans in more cases at 2023-03-03 11:56:49 from David Rowley

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Matthias van de Meent	2023-02-22 13:14:02	Re: Ignoring BRIN for HOT updates (was: -udpates seems broken)
Previous Message	Melih Mutlu	2023-02-22 13:04:00	Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication