Quick Links

Re: select ... distinct performance

From:	Martijn van Oosterhout <kleptog(at)svana(dot)org>
To:	Don Bowman <don(at)sandvine(dot)com>
Cc:	"'pgsql-general(at)postgresql(dot)org'" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: select ... distinct performance
Date:	2004-02-01 02:08:15
Message-ID:	20040201020815.GA14470@svana.org
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On Wed, Jan 28, 2004 at 11:20:30PM -0500, Don Bowman wrote:
> I have a table with a large number of rows (10K in the example below,
> but >1M in some databases). I would like to find the distinct
> values for one of the columns. The column is indexed.
>
> I would have expected that this would be a very fast operation,
> simply walking down the index. In the example below, there is
> only 1 unique value, but it takes 2 seconds. I would have
> expected more like ~50ms.

The problem is that the index doesn't contain info about which rows are
visibile in your current transaction, so it has to load the entire table to
check. Looks like it used the index to avoid a sort step. I don't think
there is a way to write this that doesn't need the whole table.

Hope this helps,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> (... have gone from d-i being barely usable even by its developers
> anywhere, to being about 20% done. Sweet. And the last 80% usually takes
> 20% of the time, too, right?) -- Anthony Towns, debian-devel-announce

In response to

select ... distinct performance at 2004-01-29 04:20:30 from Don Bowman

Browse pgsql-general by date

	From	Date	Subject
Next Message	Martijn van Oosterhout	2004-02-01 02:15:25	Re: Help: System requirements for postgresql 7.4.1
Previous Message	Eric Ridge	2004-02-01 01:56:14	Re: Very slow query - why?