Re: distinct not working in a multiple join

From: David Rio Deiros <driodeiros(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: distinct not working in a multiple join
Date: 2006-02-10 02:21:51
Message-ID: 20060210022151.GD29347@milhouse.digitaria.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, Feb 09, 2006 at 07:20:19PM -0500, Tom Lane wrote:
> David Rio Deiros <driodeiros(at)gmail(dot)com> writes:
> > I have some issues with the query attached at the end of this email.
> > If I run that query I got this output ( I have removed some of the
> > fields) despite the distinct clause:
>
> > QC Q&A | www.xxx.com | 44281
> > QC Q&A | www.xxx.com | 44281
> > WhyMAX? | | 44285
>
> Since you removed some fields, no one can tell if this output
> is wrong or not.

Tom,

Thanks for the answer and apologizes for the lack of information.
I think you found the problem already (see bellow). In anycase, just
to clarify:

This is the information about the machine/OS/Psql version:

1. Psql 8.0.4 - Linux 2.6.13

Now, This is the first query I tried:

SELECT
distinct ar.title,
ar.raw_data,
ar.upload_dt,
ar.artifact_id,
g.group_id,
acl.read
FROM
artifact_acl acl,
artifacts a,
artifact_revisions ar,
revisions_to_types rt,
artifact_types at,
groups g
WHERE
a.expire_dt > NOW() and
acl.artifact_id = a.artifact_id and
a.published_revision = ar.revision_id and
ar.revision_id = rt.revision_id and
rt.type_id = at.type_id and
acl.group_id = g.group_id and
a.suppress = false and
at.is_resource = true and
(
acl.group_id = (
select
group_id
from
groups
where
group_nm = 'Marketing'
)
OR
acl.group_id = (
select
group_id
from
groups
where
group_nm = 'Communicators'
)
) and
acl.group_id IN (4,17,54,2,1,123) and
acl.read = true and
((g.back_or_front = 'front') or
(g.group_nm = 'PR Admin'))
ORDER BY
ar.upload_dt DESC
LIMIT 3;

and here you have the output:

-[ RECORD 1 ]---------------------------
title | QC Q&A
raw_data | www.com
upload_dt | 2006-02-09 11:15:04.724525
artifact_id | 44281
group_id | 2
read | t
-[ RECORD 2 ]---------------------------
title | QC Q&A
raw_data | www.com
upload_dt | 2006-02-09 11:15:04.724525
artifact_id | 44281
group_id | 54
read | t
-[ RECORD 3 ]---------------------------
title | WhyMAX?
raw_data |
upload_dt | 2006-02-09 09:25:27.717663
artifact_id | 44061
group_id | 2
read | t

Now, I tried this query, which is the first one but removing the g.group_id
from the select clause. This is the output:

-[ RECORD 1 ]-----------------------------
title | QC Q&A
raw_data | www.com
upload_dt | 2006-02-09 11:15:04.724525
artifact_id | 44281
read | t
-[ RECORD 2 ]----------------------------
title | WhyMAX?
raw_data |
upload_dt | 2006-02-09 09:25:27.717663
artifact_id | 44061
read | t
-[ RECORD 3 ]---------------------------
title | Business Assets-test
raw_data | corpcomm.com/
upload_dt | 2006-02-08 15:58:06.81578
artifact_id | 44280
read | t

Which is the desired output.

> > SELECT
> > distinct ar.title,
> > ar.raw_data,
> > ar.upload_dt,
> > ar.artifact_id,
> > g.group_id,
> > acl.read
> > FROM
>
> The way you formatted that makes me wonder if you think that the
> DISTINCT applies only to the first column. It does not, it applies
> to all the columns together --- that is, it only removes rows that
> are identical in all columns to some other row. So if there were
> some rows that were identical except for group_id, you'd get the
> behavior you described.

Yes, that was the reason, I thought that distinct was only applying
to the first column.

> BTW, I'm not sure I believe this is actually the same query you
> ran. The presence of the "GROUP BY ar.title" clause should have
> provoked errors about ungrouped columns. If this is an exact
> copy of what you did, what Postgres version is this?

It wasn't the same query. Forget about the first email, in this email
you have the queries I launched (sorry again about that).

Now I have to redefine my query because I want to get the second
output but keeping the group_id. Ideas and suggestions are welcome.

Thanks again your help,

David

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Karl O. Pinc 2006-02-10 02:27:44 Re: Request to have VACUUM ignore cost based limits
Previous Message David Fetter 2006-02-10 01:11:57 Re: Insert more than one t-uple in a single sql