Re: Joining time fields?

From: "Oliveiros d'Azevedo Cristina" <oliveiros(dot)cristina(at)marktest(dot)pt>
To: "James David Smith" <james(dot)david(dot)smith(at)gmail(dot)com>
Cc: <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Joining time fields?
Date: 2012-07-26 14:33:07
Message-ID: 75BD54EDC7E2455BAD9698A1A61A299B@marktestcr.marktest.pt
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Hello again, James,

Yes, that makes sense for me and I took it into account when designing the query.
With an inner join you can always find a row in table B to join with table A.
If you have just one row on table A timestamped with '2012-1-1' and another on table B timestamped with '1089-2-1' they will join.

Aint I right?

As I told you, the sgn is the signal of distance. It allows you to join with the record in B with the nearest timestamp, be it after or before.

Best,
Oliveiros
----- Original Message -----
From: James David Smith
To: Oliveiros d'Azevedo Cristina
Cc: pgsql-novice(at)postgresql(dot)org
Sent: Thursday, July 26, 2012 3:22 PM
Subject: Re: [NOVICE] Joining time fields?

Afternoon,

Hmm. If I use INNER JOIN then the rows that have data from table A but not from table B are not present any more. This isn't correct. I want the rows from table A to be there. They want joining with the nearest (in time) data from table B.

To clarify, every row from table A should have a join with table B. The query should take each row of table A, and then find the nearest time in table B. If the query finds two records in table B that are both exactly the same time away from the record in table A (this won't happen very often), then the record in table A should be duplicated and both records from table B joined to each of them respectively.

Does that make sense!?

James


On 26 July 2012 15:14, Oliveiros d'Azevedo Cristina <oliveiros(dot)cristina(at)marktest(dot)pt> wrote:

Howdy, James,

The line takes into account negative differences, the sgn is for that.

On my test case I actually din't have the situation you described.

But I notice that you are using LEFT JOIN instead of INNER JOIN. Can that be the cause for the empty fields?

Best,
Oliveiros

----- Original Message -----
From: James David Smith
To: Oliveiros d'Azevedo Cristina
Cc: pgsql-novice(at)postgresql(dot)org
Sent: Thursday, July 26, 2012 2:40 PM
Subject: Re: [NOVICE] Joining time fields?

Dear Oliveiros,

Thanks for your reply. I've been working on this today, but don't think that the query is quite right. On this line of the query:

// ON y."date_time" = x."date_time" - (sgn*dist )

I think that we also need to take into account a positive difference as well as a negative? I've had a play around with your query and put all my proper field names in. I think it works. it returns the number of rows that I would expect. The only odd thing is that the first 20-30 rows of data, which represent the earliest data_time in table A, do not have any data linked to them in table B. Just empty cells. Your thoughts are appreciated!

SELECT
x."id" as id,
x."person" as person,
x."the_geom_osgb36" as location,
x."date_time" as gps_time,
y."date_time" as microaeth_time,
y."bc" as black_carbon

FROM(
SELECT
a."id",
a."date_time",
a.person,
a.the_geom_osgb36,
MIN((a."date_time" - b."date_time") * sign(EXTRACT(EPOCH FROM (a."date_time" - b."date_time")))) as dist,
sign(EXTRACT(EPOCH FROM (MIN(a."date_time" - b."date_time")))) as sgn
FROM gps_12_07_2012 a, microaeth_12_07_2012 b
WHERE a.person = 'Ben Barratt'

GROUP by
a."id",
a."date_time",
a.person,
a.the_geom_osgb36

) x

LEFT JOIN (SELECT * FROM microaeth_12_07_2012 WHERE person = 'Ben Barratt') y
ON y."date_time" = x."date_time" - (sgn*dist)
OR
y."date_time" = x."date_time" + (sgn*dist)

ORDER BY x.id, x.date_time


On 25 July 2012 12:08, Oliveiros d'Azevedo Cristina <oliveiros(dot)cristina(at)marktest(dot)pt> wrote:

Hello again, James,

I got no feedback from you, but I believe this query is flawed.

Also, I believe you need the other fields from your tables and not just the TIMESTAMP fields.

So, considering your tables have fields like table_one.col_on_one and table_two.col_on_two,
I'd try this query

SELECT x."col_on_one", x."date_time",y."col_one_two",y."date_time"
FROM(
SELECT a."col_on_one",a."date_time", MIN((a."date_time" - b."date_time") * sign(EXTRACT(EPOCH FROM (a."date_time" - b."date_time")))) as dist,
sign(EXTRACT(EPOCH FROM (MIN(a."date_time" - b."date_time")))) as sgn
FROM table_one a,table_two b
GROUP by a."col_on_one",a."date_time"
) x
INNER JOIN t_table_two y
ON y."date_time" = x."date_time" - (sgn*dist )

If it doesn't work, tell me the error it reported and we'll try to fix it

Best,
Oliver

----- Original Message -----
From: Oliveiros d'Azevedo Cristina
To: James David Smith
Cc: pgsql-novice(at)postgresql(dot)org
Sent: Tuesday, July 24, 2012 5:20 PM
Subject: Re: [NOVICE] Joining time fields?

Hi again,
James,

This is untested code. Can you see if it works?
The trouble is that if you have giant tables it will become slow...

Best,
Oliveiros

SELECT date_time_in_a, d.date_time as date_time_in_b
FROM
(
SELECT a.date_time as date_time_in_a, MIN(a.date_time - b.date_time) as dist
FROM table_one a, table_two b
GROUP BY a.date_time
) c
JOIN
table_two d
ON c.dist - c.date_time_in_a = d.date_time

----- Original Message -----
From: James David Smith
To: Oliveiros d'Azevedo Cristina
Cc: pgsql-novice(at)postgresql(dot)org
Sent: Tuesday, July 24, 2012 4:33 PM
Subject: Re: [NOVICE] Joining time fields?

Hi Oliveiros,

Thanks for your time. It's an inner join then hey instead of a left join? Ok, thanks.

In your example of using one record inn table A, and two records in table B that are exactly the same, I would like the result to be a new table with two records in it. I'm happy for the result of the query to duplicate records from table A.

Yes, even if the nearest time is 100 years away I would still like the query to get the right result. There is no limit to how far the 'nearest' time is.

Thank you

James


On 24 July 2012 16:25, Oliveiros d'Azevedo Cristina <oliveiros(dot)cristina(at)marktest(dot)pt> wrote:

Hi, James,

But that wouldn't be a LEFT JOIN, it will be an INNER JOIN.

Because you'll always be able to join a date from table a with some date from table b even if it is 100 years away...
If table a has just one record datetime = 2012-1-1 and table b has two records datetime = 2010-1-1 and datetime = 2011-1-1 then you'd be able to join table a with the second of table b' records.

You won't be able to join only if table b happens to be empty...ain't I right?

What do you mean by the closest time? Do you have some threshold ? Are they allowed to be arbitrarily far away one from each other?

Best,
Oliveiros
----- Original Message -----
From: James David Smith
To: pgsql-novice(at)postgresql(dot)org
Sent: Tuesday, July 24, 2012 3:57 PM
Subject: [NOVICE] Joining time fields?

Hi all,

I wonder if someone could help me out please. I've got two tables, both with a TIMESTAMP field. I'd like to do a left join with them. I'd like to take the date_time from table A, and join it with the nearest date_time from table B. Whether the time from B is before or after the time in A doesn't matter, I just want the closest time. I started with the below query, but it only gets me the column from table B if the time stamp exactly matches which is clearly correct. I'm sure that this should be quite easy but I can't figure it out...!

Select

a.date_time

b.date_time

FROM table_one a

LEFT JOIN table_two b ON a.date_time = b.date_time

Thanks

James

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Della-Croce, Gregory 2012-07-26 14:47:17 pgADMIN III access to table joins denied
Previous Message James David Smith 2012-07-26 14:22:51 Re: Joining time fields?