Re: ASOF join

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: David Fetter <david(at)fetter(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ASOF join
Date: 2017-06-21 08:00:27
Message-ID: CAEepm=0DmOHz06TA19XL2_YyJ+rg0Bsc1z_TRYkd75hFp6e8oQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 19, 2017 at 11:57 PM, Konstantin Knizhnik
<k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
> I attached simple patch adding ASOF join to Postgres. Right now it support
> only outer join and requires USING clause (consequently it is not possible
> to join two tables which joi keys has different names. May be it is also
> possible to support ON clause with condition written like o.k1 = i.k2 AND
> o.k2 = i.k2 AND ... AND o.kN >= i.kN
> But such notation can be confusing, because join result includes only one
> matching inner record with kN smaller or equal than kN of outer record and
> not all such records.
> As alternative we can add specia

Hmm. Yeah, I see the notational problem. It's hard to come up with a
new syntax that has SQL nature. What if... we didn't use a new syntax
at all, but recognised existing queries that are executable with this
strategy? Queries like this:

WITH ticks(time, price) AS
(VALUES ('2017-07-20 12:00:00'::timestamptz, 100.00),
('2017-07-21 11:00:00'::timestamptz, 150.00)),
times(time) AS
(VALUES ('2017-07-19 12:00:00'::timestamptz),
('2017-07-20 12:00:00'::timestamptz),
('2017-07-21 12:00:00'::timestamptz),
('2017-07-22 12:00:00'::timestamptz))

SELECT times.time, previous_tick.price
FROM times
LEFT JOIN LATERAL (SELECT * FROM ticks
WHERE ticks.time <= times.time
ORDER BY ticks.time DESC LIMIT 1) previous_tick ON true
ORDER BY times.time;

time | price
------------------------+--------
2017-07-19 12:00:00+12 |
2017-07-20 12:00:00+12 | 100.00
2017-07-21 12:00:00+12 | 150.00
2017-07-22 12:00:00+12 | 150.00
(4 rows)

I haven't used LATERAL much myself but I've noticed that it's often
used to express this type of thing. "Get me the latest ... as of time
...".

It'd a bit like the way we recognise EXISTS (...) as a semi-join and
execute it with a join operator instead of having a SEMI JOIN syntax.
On the other hand it's a bit more long winded, extreme and probably
quite niche.

--
Thomas Munro
http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mengxing Liu 2017-06-21 08:30:21 Re: [GSOC][weekly report 3] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions
Previous Message Etsuro Fujita 2017-06-21 07:59:43 Useless code in ExecInitModifyTable