Cannot do time-aligned listing with FULL JOIN, get ERROR: FULL JOIN is only supported with merge-joinable join conditions

From: "John Koerber" <johnk(at)musicreports(dot)com>
To: <pgsql-novice(at)postgresql(dot)org>
Cc: "'Roger Ging'" <rging(at)musicreports(dot)com>
Subject: Cannot do time-aligned listing with FULL JOIN, get ERROR: FULL JOIN is only supported with merge-joinable join conditions
Date: 2009-11-05 21:48:12
Message-ID: 1924389003.228191257457692836.JavaMail.root@mri-mail
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Hello,

FULL JOIN requiring merge-joinable conditions has been covered previously
in http://archives.postgresql.org/pgsql-sql/2006-01/msg00080.php for a
different application; here is a new context in which FULL JOIN in
non-merge-joinable conditions might be useful.

In this example, events have a start_time and an end_time. I want a
time-aligned listing of all events from two tables A and B, joined up
where there is any time overlap between A's events with B's events. Here
are the two tables:

CREATE TABLE public.events_a AS
SELECT 'A1' as event_a, '2009-01-03'::date as start_a,
'2009-01-03'::date as end_a
UNION SELECT 'A2' , '2009-01-07'::date ,
'2009-01-08'::date
UNION SELECT 'A3' , '2009-01-14'::date ,
'2009-01-15'::date
;
CREATE TABLE public.events_b AS
SELECT 'B1' as event_b, '2009-01-04'::date as start_b,
'2009-01-06'::date as end_b
UNION SELECT 'B2' , '2009-01-07'::date ,
'2009-01-12'::date
UNION SELECT 'B3' , '2009-01-13'::date ,
'2009-01-13'::date
;
A FULL JOIN would be good because presumably it would require only two
sequential scans, but it appears the OVERLAPS operator creates a
non-merge-joinable condition; this query returns ERROR: FULL JOIN is only
supported with merge-joinable join conditions / SQL state: 0A000:

SELECT *
FROM public.events_a A
FULL JOIN public.events_b B
ON (start_a, end_a) OVERLAPS (start_b, end_b)
ORDER BY coalesce(start_a, start_b), start_b ;

As noted in other posts to the archive listing above, the answer I am
looking for can be had by UNIONing a LEFT join and a RIGHT JOIN:

SELECT *
FROM
(
SELECT *
FROM public.events_a A
LEFT JOIN public.events_b B
ON (start_a, end_a) OVERLAPS (start_b, end_b)
UNION SELECT *
FROM public.events_a A
RIGHT OUTER JOIN public.events_b B
ON (start_a, end_a) OVERLAPS (start_b, end_b)
) U
ORDER BY COALESCE(start_a, start_b), start_b;

but this requires 4 sequential scans:

EXPLAIN above query:
QUERY PLAN
Sort (cost=298586.19..300828.86 rows=897066 width=80)
Sort Key: (COALESCE(u.start_a, u.start_b)), u.start_b
-> Subquery Scan u (cost=185220.13..209889.44 rows=897066 width=80)
-> Unique (cost=185220.13..200918.78 rows=897066 width=80)
-> Sort (cost=185220.13..187462.79 rows=897066 width=80)
Sort Key: a.event_a, a.start_a, a.end_a, b.event_b,
b.start_b, b.end_b
-> Append (cost=22.76..96523.38 rows=897066
width=80)
-> Nested Loop Left Join (cost=22.76..43776.36
rows=448533 width=80)
Join Filter:
"overlaps"((a.start_a)::timestamp with time zone, (a.end_a)::timestamp
with time zone, (b.start_b)::timestamp with time zone,
(b.end_b)::timestamp with time zone)
-> Seq Scan on events_a a
(cost=0.00..21.60 rows=1160 width=40)
-> Materialize (cost=22.76..34.36
rows=1160 width=40)
-> Seq Scan on events_b b
(cost=0.00..21.60 rows=1160 width=40)
-> Nested Loop Left Join (cost=22.76..43776.36
rows=448533 width=80)
Join Filter:
"overlaps"((a.start_a)::timestamp with time zone, (a.end_a)::timestamp
with time zone, (b.start_b)::timestamp with time zone,
(b.end_b)::timestamp with time zone)
-> Seq Scan on events_b b
(cost=0.00..21.60 rows=1160 width=40)
-> Materialize (cost=22.76..34.36
rows=1160 width=40)
-> Seq Scan on events_a a
(cost=0.00..21.60 rows=1160 width=40)

In practice, is there:
* A different structuring the time spans of the 'events' such that a
merge-joinable condition can be found?
* A possibility of teaching Hash Join to do the FULL JOIN on
non-merge-joinable conditions?

Thanks,

John Koerber
Senior Systems Developer / Analyst

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Tom Lane 2009-11-05 22:14:40 Re: Cannot do time-aligned listing with FULL JOIN, get ERROR: FULL JOIN is only supported with merge-joinable join conditions
Previous Message bob.vanslobig 2009-11-05 18:06:29 SQL Server