Re: Ordered Partitioned Table Scans

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Antonin Houska <ah(at)cybertec(dot)at>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Ordered Partitioned Table Scans
Date: 2018-12-19 14:01:14
Message-ID: CAKJS1f9ZxN50T5ayyCDBFwp65mhj3WPRUd4sqUhhK-S2eCreDQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 20 Dec 2018 at 01:58, Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> I don't see why we should drop this flag. If we know that the
> partitions are naturally ordered, they'll still be ordered after some
> partitions have been prune, so we can skip later checks if we already
> have the information. The only remaining cases this flag doesn't
> cover are:
>
> - partitions are naturally ordered but there's a default partition.
> We could store this information and later check if the default
> partition has been pruned or not
> - partitions are not naturally ordered, but become naturally ordered
> if enough partitions are pruned. I may be wrong but that doesn't seem
> like a very frequent use case to me I'd imagine that in a lot of
> cases either almost no partition are prune (or at least not enough so
> that the remaining one are ordered), or all but one partition is
> pruned),. So keeping a low overhead for the
> almost-no-pruned-partition with naturally ordered partitions case
> still seems like a good idea to me.

I'm objecting to processing for all partitions, but processing for
just non-pruned partitions seems fine to me. If there are 10k
partitions and we pruned none of them, then planning will be slow
anyway. I'm not too worried about slowing it down a further
microsecond or two. It'll be a drop in the ocean. When we have the
live_parts flag in RelOptInfo then we can allow all of the cases
you've mentioned above, we'll just need to look at the non-pruned
partitions, and in partition order, determine if the lowest LIST
partitioned value sorts earlier than some earlier partition's highest
LIST value and disable the optimisation for such cases.

The flag you've mentioned will become redundant when support is added
for the cases you've mentioned above. I don't see any reason not to
support all these cases, once the live_parts flag makes in into
RelOptInfo. I'm also a bit confused at why you think it's so
important to make multi-valued LIST partitions work when no values are
interleaved, but you suddenly don't care about the optimisation when
the interleaved value partitions get pruned. Can you share your
reasoning for that?

If you're really so keen on this flag, can you share the design you
have in mind? If it's just a single bool flag like "parts_ordered",
and that's set to false, then how would you know there is some natural
order when the DEFAULT partition gets pruned? Or are you proposing
multiple flags, maybe two flags, one for when the default is pruned
and one when it's not? If so, I'd question why the default partition
is so special? Pruning of any of the other partitions could turn a
naturally unordered LIST partitioned table into a naturally ordered
partitioned table if the pruned partition happened to be the only one
with interleaved values. Handling only the DEFAULT partition in a
special way seems to violate the principle of least astonishment.

But in short, I just really don't like the flags idea and I'm not
really willing to work on it or put my name on it. I'd much rather
wait then build a proper solution that works in all cases. I feel the
current patch is worthwhile as it stands.

> The multi-level partitioning case is another
> thing that would need to be handled for instance (and that's the main
> reason I couldn't submit a new patch when I was working on it), and
> I'm definitely not arguing to cover it in this patch.

As far as I'm aware, the multi-level partitioning should work just
fine with the current patch. I added code for that a while ago. There
are regression tests to exercise it. I'm not aware of any cases where
it does not work.

> That being
> said, I'll try to have a look at this patch too, but as I said I have
> a lot of catch-up to do in this part of the code, so I'm afraid that
> I'll not be super efficient.

Thanks for your time on this so far.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Chapman Flack 2018-12-19 14:17:24 Re: GraalVM
Previous Message Julien Rouhaud 2018-12-19 12:58:03 Re: Ordered Partitioned Table Scans