Re: [PROPOSAL] Temporal query processing with range types

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Moser <pitiz29a(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Johann Gamper <gamper(at)inf(dot)unibz(dot)it>, Michael Böhlen <boehlen(at)ifi(dot)uzh(dot)ch>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Anton Dignös <anton(dot)dignoes(at)unibz(dot)it>
Subject: Re: [PROPOSAL] Temporal query processing with range types
Date: 2017-02-15 19:24:24
Message-ID: CA+TgmobJH-=c44gjudCSaS-vuOQ4XwgFZ9jJBT8A+bvWTVQKTg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 24, 2017 at 4:32 AM, Peter Moser <pitiz29a(at)gmail(dot)com> wrote:
>> Using common terms such as ALIGN and NORMALIZE for such a specific
>> functionality seems a bit wrong.
>
> Would ALIGN RANGES/RANGE ALIGN and NORMALIZE RANGES/RANGE NORMALIZE be better
> options? We are also thankful for any suggestion or comments about the syntax.

So it seems like an ALIGN or NORMALIZE option is kind of like a JOIN,
except apparently there's no join type and the optimizer can never
reorder these operations with each other or with other joins. Is that
right? The optimizer changes in this patch seem fairly minimal, so
I'm guessing it can't be doing anything very complex here.

What happens if you perform the ALIGN or NORMALIZE operation using
something other than an equality operator, like, say, less-than? Or
an arbitrary user-defined operator.

There's no documentation in this patch. I'm not sure you want to go
to the trouble of writing SGML documentation until this has been
reviewed enough that it has a real chance of getting committed, but on
the other hand we're obviously all struggling to understand what it
does, so I think if not SGML documentation it at least needs a real
clear explanation of what the syntax is and does in a README or
something, even just for initial review.

We don't have anything quite like this in PostgreSQL today. An
ordinary join just matches up things in relation A and relation B and
outputs the matching rows, and something like a SRF takes a set of
input rows and returns a set of output rows. This is a hybrid - it
takes in two sets of rows, one from each relation being joined, and
produces a derived set of output rows that takes into account both
inputs.

+ * INPUT:
+ * (r ALIGN s ON q WITH (r.ts, r.te, s.ts, s.te)) c
+ * where q can be any join qualifier, and r.ts, r.te, s.ts, and s.t
e
+ * can be any column name.
+ *
+ * OUTPUT:
+ * (
+ * SELECT r.*, GREATEST(r.ts, s.ts) P1, LEAST(r.te, s.te) P2
+ * FROM
+ * (
+ * SELECT *, row_id() OVER () rn FROM r
+ * ) r
+ * LEFT OUTER JOIN
+ * s
+ * ON q AND r.ts < s.te AND r.te > s.ts
+ * ORDER BY rn, P1, P2
+ * ) c

It's hard to see what's going on here. What's ts? What's te? If you
used longer names for these things, it might be a bit more
self-documenting.

If we are going to transform an ALIGN operator in to a left outer
join, why do we also have an executor node for it?

+ fcLowerLarg = makeFuncCall(SystemFuncName("lower"),
+
list_make1(crLargTs),
+
UNKNOWN_LOCATION);
+ fcLowerRarg = makeFuncCall(SystemFuncName("lower"),
+
list_make1(crRargTs),
+
UNKNOWN_LOCATION);
+ fcUpperLarg = makeFuncCall(SystemFuncName("upper"),
+
list_make1(crLargTs),
+
UNKNOWN_LOCATION);
+ fcUpperRarg = makeFuncCall(SystemFuncName("upper"),
+
list_make1(crRargTs),
+
UNKNOWN_LOCATION);

Why is a temporal operator calling functions that upper-case and
lower-case strings? In one sense this whole function (and much of the
nearby code) is very straightforward code and you can see exactly why
it's doing it. In another sense it's totally inscrutable: WHY is it
doing any of that stuff?

- char *strategy; /* partitioning strategy
('list' or 'range') */
- List *partParams; /* List of PartitionElems */
- int location; /* token
location, or -1 if unknown */
+ char *strategy; /* partitioning strategy ('list' or 'range') */
+ List *partParams; /* List of PartitionElems */
+ int location; /* token location, or
-1 if unknown */

I think this is some kind of mistake on your end while generating the
patch. It looks like you patched one version of the source code, and
diffed against another.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-02-15 19:36:44 Re: Parallel Index-only scan
Previous Message Andres Freund 2017-02-15 19:23:20 Re: Should we cacheline align PGXACT?