Re: range_agg

From: David Fetter <david(at)fetter(dot)org>
To: Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: range_agg
Date: 2019-07-05 17:45:32
Message-ID: 20190705174531.GE24679@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 05, 2019 at 09:58:02AM -0700, Paul A Jungwirth wrote:
> On Mon, Jul 1, 2019 at 3:38 PM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> >
> > For getting into core though, it should be a more complete set of
> > related operations. The patch is implicitly introducing the concept of
> > a "multirange" (in this case, an array of ranges), but it's not making
> > the concept whole.
> >
> > What else should return a multirange? This patch implements the union-
> > agg of ranges, but we also might want an intersection-agg of ranges
> > (that is, the set of ranges that are subranges of every input). Given
> > that there are other options here, the name "range_agg" is too generic
> > to mean union-agg specifically.
>
> Thanks for the review!
>
> I like the idea of adding a multirange type that works like range
> types, although I'm not sure I want to build it. :-)
>
> My own motivations for the range_agg patch are for temporal databases,
> where I have two use-cases: checking temporal foreign keys [1] and
> doing a Snodgrass "coalesce" operation [2]. The FKs use-case is more
> important. For coalesce I just immediately UNNEST the array, so a
> multirange would sort of "get in the way". It's no big deal if I can
> cast a multirange to an array, although for something that would run
> on every INSERT/UPDATE/DELETE I'd like to understand what the cast
> would cost us in performance. But coalesce isn't part of SQL:2011 and
> should be optional behavior or just something in an extension. The FKs
> use case matters to me a lot more, and I think a multirange would be
> fine for that. Also a multirange would solve the generics problem
> Pavel asked about. (I'll say more about that in a reply to him.)
>
> I'm not convinced that an intersection aggregate function for
> multiranges would be used by anyone, but I don't mind building one.
> Every other *_agg function has an "additive" sense, not a
> "subtractive" sense. json{,b}_object_agg are the closest since you
> *could* imagine intersection semantics for those, but they are unions.
> So in terms of *naming* I think using "range_agg" for union semantics
> is natural and would fit expectations. (I'm not the first to name this
> function range_agg btw: [3]).
>
> But there is clearly more than one worthwhile way to aggregate ranges:
>
> - David suggested weighted_range_agg and covering_range_agg. At PGCon

If I understand the cases correctly, the combination of covering_range
and multi_range types covers all cases. To recap, covering_range_agg
assigns a weight, possibly 0, to each non-overlapping sub-range. A
cast from covering_range to multi_range would meld adjacent ranges
with non-zero weights into single ranges in O(N) (N sub-ranges) time
and some teensy amount of memory for tracking progress of adjacent
ranges of non-zero weight. Gaps would just be multi_range consisting
of the sub-ranges of the covering_range with weight 0, and wouldn't
require any tracking.

What have I missed?

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Paul A Jungwirth 2019-07-05 17:57:53 Re: range_agg
Previous Message Paul A Jungwirth 2019-07-05 17:38:26 Re: range_agg