BUG #3979: SELECT DISTINCT slow even on indexed column

From: "David Lee" <david_lee(at)bigfix(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #3979: SELECT DISTINCT slow even on indexed column
Date: 2008-02-21 23:34:12
Message-ID: 200802212334.m1LNYCHn084400@wwwmaster.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs


The following bug has been logged online:

Bug reference: 3979
Logged by: David Lee
Email address: david_lee(at)bigfix(dot)com
PostgreSQL version: 8.2.6
Operating system: Ubuntu Feisty Server
Description: SELECT DISTINCT slow even on indexed column
Details:

\d x:

Column | Type | Modifiers
--------------+-----------------------------+-----------
a | integer | not null
b | integer | not null
time | timestamp without time zone | not null
remote_time | timestamp without time zone | not null
ip | inet | not null

The table has 20 million rows.

The table "x" has an index on ("a", "b").

I first tried:
SELECT DISTINCT a, b FROM x

but it was so slow.

I ran EXPLAIN and it showed that the path did not use the index, so I ran:

SET enable_seqscan = off;

and ran the query again.

Although it used the index, the query was still very slow.

Finally, I ran:
SELECT a, b FROM x GROUP BY a, b;

But it was still the same.

Next I created an index on ("a") and ran the query:
SELECT DISTINCT a FROM x

but the same thing happened (first didn't use the index; after turning
seq-scan off, was still slow; tried using GROUP BY, still slow).

The columns "a" and "b" are NOT NULL and has 100 distinct values each. The
indexes are all btree indexes.

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Jeff Davis 2008-02-22 00:37:42 Re: BUG #3979: SELECT DISTINCT slow even on indexed column
Previous Message Simon Riggs 2008-02-21 21:18:32 Re: Incomplete docs for restore_command for hot standby