Re: Querying distinct values from a large table

From: Igor Lobanov <ilobanov(at)swsoft(dot)com>
To: Richard Huxton <dev(at)archonet(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Querying distinct values from a large table
Date: 2007-01-30 09:33:11
Message-ID: 45BF10D7.3040203@swsoft.com
Lists: pgsql-performance

Richard Huxton wrote:
>> I have a rather large table with about 5 million rows and a dozen
>> columns. Let's suppose the columns are named 'a', 'b', 'c', etc. I
>> need to query distinct pairs of ('a', 'b') from this table.
>
> What version of PostgreSQL is it?

8.1.4

> How many distinct values are you getting back from your 5 million rows? 
> If there are too many, an index isn't going to help.

No more than 10,000.

> Can you share the EXPLAIN ANALYSE output? You might want to try 
> increasing work_mem for this one query to speed any sorting.

Real table and column names are obfuscated because of an NDA, sorry.

explain analyze select distinct a, b from tbl

EXPLAIN ANALYZE output is:

  Unique  (cost=500327.32..525646.88 rows=1848 width=6) (actual time=52719.868..56126.356 rows=5390 loops=1)
    ->  Sort  (cost=500327.32..508767.17 rows=3375941 width=6) (actual time=52719.865..54919.989 rows=3378864 loops=1)
          Sort Key: a, b
          ->  Seq Scan on tbl  (cost=0.00..101216.41 rows=3375941 width=6) (actual time=16.643..20652.610 rows=3378864 loops=1)
  Total runtime: 57307.394 ms
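
For reference, the work_mem suggestion could be tried for just this one query roughly as follows. This is only a sketch: the 65536 figure (64 MB) is an arbitrary assumption, not a tuned value. On 8.1 the setting is given as a plain integer in kilobytes, and SET LOCAL keeps the change scoped to the enclosing transaction.

```sql
BEGIN;
-- Assumed value: 65536 kB = 64 MB; choose something the server can afford.
SET LOCAL work_mem = 65536;
-- Note that EXPLAIN ANALYZE really executes the query.
EXPLAIN ANALYZE SELECT DISTINCT a, b FROM tbl;
COMMIT;  -- work_mem reverts to its previous value here
```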

> How often is the table updated? Clustering might buy you some 
> improvements (but not a huge amount I suspect).

It is updated once every 3-5 seconds.

And one more thing. I don't know if it helps, but column 'a' can only
take values from a limited set: 0, 1 or 2. Column 'b' is also an integer
(a foreign key, actually).
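
Since 'a' has only three possible values and 'b' references another table, one possible rewrite is to enumerate the candidate pairs and probe tbl for each one instead of sorting every row. The following is a sketch only, not tested against the real schema. The names ref_tbl, its key column id, and the index tbl_a_b_idx are hypothetical stand-ins for the obfuscated objects.

```sql
-- Hypothetical index so that each probe below can be an index lookup.
CREATE INDEX tbl_a_b_idx ON tbl (a, b);

-- ref_tbl(id) stands for the table that tbl.b references (assumed name).
-- The three values of 'a' are spelled out with UNION ALL, which works on 8.1.
SELECT v.a, r.id AS b
FROM (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2) AS v,
     ref_tbl AS r
WHERE EXISTS (SELECT 1
              FROM tbl t
              WHERE t.a = v.a
                AND t.b = r.id);
```

This issues three probes per row of ref_tbl, so whether it beats the sort depends on how large the referenced table is. Unlike SELECT DISTINCT, it also returns no pairs where 'b' is NULL.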

-- 
Igor Lobanov
Internal Development Engineer
SWsoft, Inc.

