Re: range_agg

From: David Fetter <david(at)fetter(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: range_agg
Date: 2020-03-07 22:13:13
Message-ID: 20200307221312.GV13804@fetter.org
Lists: pgsql-hackers

On Sat, Mar 07, 2020 at 04:06:32PM -0500, Tom Lane wrote:
> I wrote:
> > However, what I'm on about right at the moment is that I don't think
> > there should be any delta in that test at all. As far as I can see,
> > the design idea here is that multiranges will be automatically created
> > over range types, and the user doesn't need to do that. To my mind,
> > that means that they're an implementation detail and should not show up as
> > separately-owned objects, any more than an autogenerated array type does.
>
> Actually ... have you given any thought to just deciding that ranges and
> multiranges are the same type? That is, any range can now potentially
> contain multiple segments? That would eliminate a whole lot of the
> tedious infrastructure hacking involved in this patch, and let you focus
> on the actually-useful functionality.

If we're changing range types rather than constructing a new
multi-range layer atop them, I think it would be helpful to have some
way to figure out quickly whether this new range type was contiguous.
One way to do that would be to include a "range cardinality" in the
data structure, which would be the number of left ends in it.
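
As a minimal sketch of the "range cardinality" idea (Python, not PostgreSQL code), assume a multi-segment range is modelled as a list of half-open (lo, hi) pairs. The names normalize, range_cardinality and is_contiguous are hypothetical. A real implementation would presumably store the count in the datum rather than recompute it as done here.

```python
from typing import List, Tuple

Range = Tuple[float, float]  # half-open [lo, hi); an assumption of this sketch


def normalize(ranges: List[Range]) -> List[Range]:
    """Merge overlapping or adjacent half-open ranges into disjoint segments."""
    merged: List[Range] = []
    # Sort by lower bound and skip empty ranges (lo >= hi).
    for lo, hi in sorted(r for r in ranges if r[0] < r[1]):
        if merged and lo <= merged[-1][1]:
            # Overlaps or touches the previous segment: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            # Starts a new segment, i.e. a new "left end".
            merged.append((lo, hi))
    return merged


def range_cardinality(ranges: List[Range]) -> int:
    """Number of left ends once the ranges are merged."""
    return len(normalize(ranges))


def is_contiguous(ranges: List[Range]) -> bool:
    """True when the merged value is a single segment."""
    return range_cardinality(ranges) == 1
```

With the cardinality at hand, the contiguity test is a single integer comparison instead of a scan over the segments.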

One of the things I'd pictured doing with multiranges was along the
lines of a "full coverage" constraint like "During a shift, there can
be no interval that's not covered," which would correspond to a "range
cardinality" of 1.
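
The "full coverage" check can be sketched as follows, again in Python and under assumptions: the shift and each assignment are half-open (lo, hi) pairs, and uncovered_gaps is a hypothetical helper name. The shift is fully covered exactly when no gap is reported, which matches the union of the assignments, clipped to the shift, being one segment equal to the shift.

```python
from typing import List, Tuple

Range = Tuple[float, float]  # half-open [lo, hi); an assumption of this sketch


def uncovered_gaps(shift: Range, assignments: List[Range]) -> List[Range]:
    """Return the parts of `shift` that no assignment covers."""
    start, end = shift
    gaps: List[Range] = []
    covered_to = start  # everything in [start, covered_to) is known covered
    for lo, hi in sorted(assignments):
        if hi <= covered_to:
            continue  # adds nothing beyond what is already covered
        if lo >= end:
            break  # sorted by lower bound, so the rest start after the shift
        if lo > covered_to:
            gaps.append((covered_to, lo))  # hole before this assignment
        covered_to = max(covered_to, hi)
        if covered_to >= end:
            break
    if covered_to < end:
        gaps.append((covered_to, end))  # tail of the shift left uncovered
    return gaps
```

A constraint along these lines would reject any set of assignments for which uncovered_gaps returns a non-empty list.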

I confess I'm getting a little twitchy about the idea of eliding the
distinction between "one" and "many", though.

> Assuming that that's ok, it seems like we could consider the traditional
> range functions like lower() and upper() to report on the first or last
> range bound in a multirange --- essentially, they ignore any "holes"
> that exist inside the range. And the new functions for multiranges
> act much like array slicing, in that they give you back pieces of a range
> that aren't actually of a distinct type.

So new functions along the lines of lowers(), uppers(), opennesses(),
etc.? I guess this could be extended as needs emerge.
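
To make the split between the singular and plural accessors concrete, here is a small Python sketch. It assumes a multi-segment range is a sorted list of disjoint segments, each carrying its own bound flags. The Segment type and the function bodies are illustrative only; lowers(), uppers() and opennesses() are the proposed names above, not existing functions.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    lo: float
    hi: float
    bounds: str = "[)"  # inclusivity flags, written the way range literals are


def lower(mr: List[Segment]) -> Optional[float]:
    """First lower bound; any holes inside the value are ignored."""
    return mr[0].lo if mr else None


def upper(mr: List[Segment]) -> Optional[float]:
    """Last upper bound; any holes inside the value are ignored."""
    return mr[-1].hi if mr else None


def lowers(mr: List[Segment]) -> List[float]:
    """Lower bound of every segment, in order."""
    return [s.lo for s in mr]


def uppers(mr: List[Segment]) -> List[float]:
    """Upper bound of every segment, in order."""
    return [s.hi for s in mr]


def opennesses(mr: List[Segment]) -> List[str]:
    """Bound flags of every segment, in order."""
    return [s.bounds for s in mr]
```

For a value with segments [0,1) and [2,5], lower() and upper() give 0 and 5, while lowers() and uppers() give [0, 2] and [1, 5].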

There's another use case not yet covered here that could make this
even more complex, and we should probably plan for it: multi-ranges
with weights.

For example,

SELECT weighted_range_union(r)
FROM (VALUES('[0,1)'::float8range), ('[0,3)'), ('[2,5)')) AS t(r)

would yield something along the lines of:

(([0,1),1), ([1,3),2), ([3,5),1))

and wedging that into the range type seems messy. Each range would
then have a cardinality, and each range within would have a weight,
all of which would be an increasingly heavy burden on the common case
where there's just a single range.
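
One possible reading of weighted_range_union can be sketched with a sweep over the range endpoints. This is Python, not PostgreSQL code, and it rests on two assumptions: ranges are half-open (lo, hi) pairs, and the weight of a segment is the number of input ranges covering it. The exact segments produced depend on that choice of semantics.

```python
from typing import List, Tuple

Range = Tuple[float, float]      # half-open [lo, hi); an assumption
Weighted = Tuple[Range, int]     # (segment, number of inputs covering it)


def weighted_range_union(ranges: List[Range]) -> List[Weighted]:
    """Split the union of `ranges` into segments labelled by overlap count."""
    events: List[Tuple[float, int]] = []
    for lo, hi in ranges:
        if lo < hi:                 # ignore empty ranges
            events.append((lo, 1))  # a range opens here
            events.append((hi, -1)) # a range closes here
    events.sort()

    result: List[Weighted] = []
    depth = 0                       # how many ranges are currently open
    prev = None                     # previous event position
    for point, delta in events:
        if prev is not None and point > prev and depth > 0:
            if result and result[-1][0][1] == prev and result[-1][1] == depth:
                # Same weight as the adjacent previous segment: extend it.
                (seg_lo, _), weight = result[-1]
                result[-1] = ((seg_lo, point), weight)
            else:
                result.append(((prev, point), depth))
        depth += delta
        prev = point
    return result
```

Under these assumptions, the inputs [0,3) and [1,5) give ([0,1),1), ([1,3),2), ([3,5),1), and the three inputs [0,1), [0,3), [2,5) give ([0,1),2), ([1,2),1), ([2,3),2), ([3,5),1). Every segment carries its own weight, which is the per-segment overhead a plain single range would otherwise have to pay.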

Enhancing a separate multirange type to have weights seems like a
cleaner path forward.

Given that, I'm -1 on mushing multi-ranges into a special case of
ranges, or /vice versa/.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
