Re: Joining time fields?

From: Bryan Lee Nuse <nuse(at)uga(dot)edu>
To: James David Smith <james(dot)david(dot)smith(at)gmail(dot)com>
Cc: "pgsql-novice(at)postgresql(dot)org" <pgsql-novice(at)postgresql(dot)org>, Oliveiros d'Azevedo Cristina <oliveiros(dot)cristina(at)marktest(dot)pt>
Subject: Re: Joining time fields?
Date: 2012-07-26 16:43:49
Message-ID: 387689AA-524C-4639-9EDB-BED34F25A052@uga.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Hello James,

Here's an alternative query that I believe gets what you're asking for. It is less clever and succinct than the one Oliveiros suggested, but may be easier to scrutinize if you are carrying a lot of additional columns through to the resulting table (you don't need to GROUP BY a bunch of columns, as you did in the query you posted). Also, I'd suggest saving any WHERE clause for the outer query (no matter what your final code looks like), as that will help in trouble-shooting. Like Oliveiros's query, this one rests upon a cross join, and will therefore be slow if your tables are huge.

Here's a self-contained test:

CREATE TABLE table_one (
date_time TIMESTAMP,
letter char(1)
);

CREATE TABLE table_two (
date_time TIMESTAMP,
letter char(1)
);

INSERT INTO table_one
VALUES ('2012-06-23 11:43:26', 'A'),
('2010-05-16 06:45:20', 'B'),
('2000-01-04 15:57:01', 'C'),
('1993-10-06 04:36:09', 'D');

INSERT INTO table_two
VALUES ('2000-06-23 11:43:26', 'W'),
('1994-05-16 06:45:20', 'X'),
('2008-01-04 15:57:01', 'Y'),
('1994-05-16 06:45:20', 'Z'); -- duplicate value

The query. Two WITH queries, the first to perform the cross join, the second to aggregate the crossed table according to the minimum time interval. These two queries are then JOINed in the main query:

WITH
crossed AS (
SELECT
A.date_time AS dt_a,
A.letter AS let_a,
B.date_time AS dt_b,
B.letter AS let_b,
@(extract(EPOCH FROM A.date_time - B.date_time)) AS interval
FROM
table_one A, table_two B
),
grouped AS (
SELECT dt_a, min(interval) AS interval
FROM crossed
GROUP BY dt_a
)
SELECT C.*
FROM
grouped G
LEFT JOIN
crossed C
USING (dt_a, interval)
ORDER BY dt_a;

Here's the output (recovers the duplicate rows in table_two, as you specified) :

dt_a | dt_b | interval
---------------------+---------------------+-----------
1993-10-06 04:36:09 | 1994-05-16 06:45:20 | 19188551
1993-10-06 04:36:09 | 1994-05-16 06:45:20 | 19188551
2000-01-04 15:57:01 | 2000-06-23 11:43:26 | 14759185
2010-05-16 06:45:20 | 2008-01-04 15:57:01 | 74530099
2012-06-23 11:43:26 | 2008-01-04 15:57:01 | 140989585
(5 rows)

Hope that is helpful...
Bryan

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Oliveiros d'Azevedo Cristina 2012-07-26 19:14:12 Re: Joining time fields?
Previous Message Tom Lane 2012-07-26 14:49:52 Re: Is there an answer to the Ultimate Question for PostgreSQL?