Re: Strange DISTINCT !

From: Oleg Lebedev <olebedev(at)waterford(dot)org>
To: Jean-Christophe Boggio <cat(at)thefreecat(dot)org>
Cc: Postgresql Mailing List <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Strange DISTINCT !
Date: 2001-08-20 19:20:16
Message-ID: 3B8162EF.6AF5E607@waterford.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

I think this is because if you remove duplicates before joining the tables,
then you would join smaller tables, therefore cutting the cost of the join
(and later sorting and removing duplicates).
Say tmp_stat has the size of 1000 and 10 duplicates on the everage for each
distinct tuple. Also, if the size of prefs is 1000, then the cost of joining
(assuming you have no indexing and just use nested loop joins) these two
tables is 1000 x 1000 = 1,000, 000. After this result of the join needs to
be sorted and duplicates removed.
However, if you remove duplicates before joining, the result (cost) of join
is 100 x 1000 = 100, 000, which also reduces the time of sorting and
duplicates removal.
cheers,

Oleg

Jean-Christophe Boggio wrote:

> Hello, can someone explain to me why this query is so slow :
>
> select distinct t.idmembre,p.datecrea
> from tmp_stat t,prefs p
> where p.idmembre=t.idmembre
> limit 5;
>
> And this one is so fast :
>
> select t.idmembre,p.datecrea
> from (select distinct idmembre from tmp_stat) as t,
> prefs p
> where p.idmembre=t.idmembre
> limit 5;
>
> (I currently have idmembre as an index on tmp_stat and prefs)
>
> How does DISTINCT work ? Is this a bug or a misconfigured index ?
>
> --
> Jean-Christophe Boggio
> cat(at)thefreecat(dot)org -o)
> Independant Consultant and Developer /\\
> Delphi, Linux, Perl, PostgreSQL _\_V
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Johannes Grødem 2001-08-20 19:25:42 Re: Finding table constraints
Previous Message Josh Berkus 2001-08-20 18:14:11 Re: user defined function question