Re: LEFT JOIN and missing values

From: Tomasz Myrta <jasiek(at)klaster(dot)net>
To: mila <me(at)pierro(dot)dds(dot)nl>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: LEFT JOIN and missing values
Date: 2003-03-12 18:47:07
Message-ID: 3E6F80AB.8040709@klaster.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

mila wrote:
> Hi,
>
> I have a problem, like this:
>
> I have 2 tables, T1 (id, val1) -- a vector
> and T2 (id1, id2, val2) -- this is actually a sqare matrix
> in the form of a very long table.
>
> I need to do some operations with them, such as take a
> dot-product of 2 rows of the matrix (encoded as T2), weighted with T1.. .
>
> The query is the following:
>
> select c.id1, d.id1, sum(c.val2*d.val2*T1.val1)
> from T1, T2 c, T2 d
> where c.ID2 = T1.id AND d.id2 = T1.id AND c.id1 < d.id1
> group by c.id1, d.id1;

>
> This gives me all possible pairs of (weighted) dot-products, as I want them.
>
> Now, some 90% of values in the T2 are the same, say, 0.5, so I could
> thouw them away. But then the query above is not valid anymore, since
> if there is mo match on the condition c.ID2 = T1.id AND d.id2 =
> T1.id, nothing is added (it might be that c.id2 is there and d.id2 is
> not!).
>
> I quiess I need a left join + substitute for missing value. Can anyone
> help me with the query? I cannot figure it out, too bad ad left
> joins...
>
> thanks,
> Mila

First let's rewrite your query to explicit joins (I like them very much,
because they show the problem more clearly)

select c.id1, d.id1, sum(c.val2*d.val2*T1.val1)
from
T1
join T2 c on (c.ID2 = T1.id)
join T2 d on (d.id2 = T1.id and c.id1 < d.id1)
group by c.id1, d.id1;

If I understand well you want to delete useless values:

delete from t2 where val2=0.5

your query should look then:

select c.id1, d.id1, sum(
case when c.id2 is null then 0.5 else c.val2 end *
case when d.id2 is null then 0.5 else d.val2 end *T1.val1)
from
T1
left join T2 c on (c.ID2 = T1.id)
left join T2 d on (d.id2 = T1.id and (c.id1 < d.id1 or c.id1 is null))
group by c.id1, d.id1;

I'm not sure about this query... There is a problem - is it possible to
have missing values in your matrix? This solution replaces all missing
values into 0.5. What should happen to "T2 alias d" if there is missing
value in "T2 alias c"? What happens then to clause "c.id1<d.id1" if "c"
is not found?

Regards,
Tomasz Myrta

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Dan Langille 2003-03-12 19:36:08 extending ORDER BY make query much slower
Previous Message mila 2003-03-12 18:24:23 LEFT JOIN and missing values