From: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Peter Moser <pitiz29a(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Johann Gamper <gamper(at)inf(dot)unibz(dot)it>, Michael Böhlen <boehlen(at)ifi(dot)uzh(dot)ch>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Anton Dignös <anton(dot)dignoes(at)unibz(dot)it> |
Subject: | Re: [PROPOSAL] Temporal query processing with range types |
Date: | 2017-02-15 20:33:52 |
Message-ID: | CAKFQuwY3zTTvYySUswWnLL_EL01vevdNfXLn6a0AeGqOS-_Pdw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Feb 15, 2017 at 12:24 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Jan 24, 2017 at 4:32 AM, Peter Moser <pitiz29a(at)gmail(dot)com> wrote:
> >> Using common terms such as ALIGN and NORMALIZE for such a specific
> >> functionality seems a bit wrong.
> >
> > Would ALIGN RANGES/RANGE ALIGN and NORMALIZE RANGES/RANGE NORMALIZE be
> better
> > options? We are also thankful for any suggestion or comments about the
> syntax.
>
> So it seems like an ALIGN or NORMALIZE option is kind of like a JOIN,
> except apparently there's no join type and the optimizer can never
> reorder these operations with each other or with other joins. Is that
> right? The optimizer changes in this patch seem fairly minimal, so
> I'm guessing it can't be doing anything very complex here.
>
> + * INPUT:
> + * (r ALIGN s ON q WITH (r.ts, r.te, s.ts, s.te)) c
> + * where q can be any join qualifier, and r.ts, r.te, s.ts,
> and s.t
> e
> + * can be any column name.
> + *
> + * OUTPUT:
> + * (
> + * SELECT r.*, GREATEST(r.ts, s.ts) P1, LEAST(r.te, s.te) P2
> + * FROM
> + * (
> + * SELECT *, row_id() OVER () rn FROM r
> + * ) r
> + * LEFT OUTER JOIN
> + * s
> + * ON q AND r.ts < s.te AND r.te > s.ts
> + * ORDER BY rn, P1, P2
> + * ) c
>
> It's hard to see what's going on here. What's ts? What's te? If you
> used longer names for these things, it might be a bit more
> self-documenting.
>
Just reasoning out loud here...
ISTM ts and te are "temporal [range] start" and "temporal [range] end" (or
probably just the common "timestamp start/end")
From what I can see it is affecting an intersection of the two ranges and,
furthermore, splitting the LEFT range into sub-ranges that match up with
the sub-ranges found on the right side. From the example above this seems
like it should be acting on self-normalized ranges - but I may be missing
something by evaluating this out of context.
r1 [1, 6] [ts, te] [time period start, time period end]
s1 [2, 3]
s2 [3, 4]
s3 [5, 7]
r LEFT JOIN s ON (r.ts < s.te AND r.te > s.ts)
r1[1, 6],s1[2, 3] => [max(r.ts, s.ts),min(r.te, s.te)] => r1[1, 6],d[2, 3]
r1[1, 6],s2[3, 4] => [max(t.ts, s.ts),min(r.te, s.te)] => r1[1, 6],d[3, 4]
r1[1, 6],s3[5, 7] => [max(t.ts, s.ts),min(r.te, s.te)] => r1[1, 6],d[5, 6]
Thus the intersection is [2,6] but since s1 has three ranges that begin
between 2 and 6 (i.e., 2, 3, and 5) there are three output records that
correspond to those sub-ranges.
The description in the OP basically distinguishes between NORMALIZE and
ALIGN in that ALIGN, as described above, affects an INTERSECTION on the two
ranges - discarding the non-overlapping data - while NORMALIZE performs the
alignment while also retaining the non-overlapping data.
The rest of the syntax seems to deal with selecting subsets of range
records based upon attribute data.
David J.
From | Date | Subject | |
---|---|---|---|
Next Message | Karl O. Pinc | 2017-02-15 20:53:34 | Re: Patch to implement pg_current_logfile() function |
Previous Message | Jeff Janes | 2017-02-15 20:25:05 | Re: operator_precedence_warning vs make installcheck |