Query planner question

From: Ernest E Vogelsinger <ernest(at)vogelsinger(dot)at>
To: pgsql-general(at)postgresql(dot)org
Subject: Query planner question
Date: 2003-06-12 21:37:47
Message-ID: 5.1.1.6.2.20030612224331.03afb828@mail.vogelsinger.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


Maybe someone can shed a light on this:

I have a (test) table, populated with 2M rows:
rid int4, -- primary key
dcid varchar,
dsid varchar,
drid int4,
owid int4,
nxid int4,
usg int4,
--- something more that's not important

Indexes in question:
CREATE INDEX id_owid ON table (owid,nxid,usg) USING BTREE
CREATE INDEX id_dowid ON table (owid, dcid, dsid, drid, nxid, usg)
USING BTREE
CREATE INDEX id_dictid ON table (dcid, dsid, drid, nxid) USING BTREE

Test Run

VACUUM ANALYZE FULL;
VACUUM
EXPLAIN ANALYZE SELECT DISTINCT dcid,dsid,drid FROM table WHERE owid = 1;
NOTICE: QUERY PLAN:
Unique (cost=402.07..402.90 rows=11 width=20) (actual time=10.13..11.49
rows=512 loops=1)
-> Sort (cost=402.07..402.07 rows=111 width=20) (actual
time=10.13..10.46 rows=512 loops=1)
-> Index Scan using id_owid on table (cost=0.00..398.30 rows=111
width=20) (actual time=0.05..4.44 rows=512 loops=1)
Total runtime: 11.95 msec

EXPLAIN ANALYZE SELECT DISTINCT dcid,dsid,drid FROM table WHERE owid is null;
NOTICE: QUERY PLAN:
Unique (cost=126611.73..128034.59 rows=18971 width=20) (actual
time=27515.63..28359.88 rows=513 loops=1)
-> Sort (cost=126611.73..126611.73 rows=189714 width=20) (actual
time=27515.62..27792.29 rows=199131 loops=1)
-> Seq Scan on table (cost=0.00..106885.04 rows=189714
width=20) (actual time=18.76..16467.28 rows=199131 loops=1)
Total runtime: 28633.68 msec

SET enable_seqscan TO off;
SET VARIABLE
EXPLAIN SELECT DISTINCT dcid,dsid,drid FROM table WHERE owid is null;
-- this process uses 34M RSS!!!
NOTICE: QUERY PLAN:
Unique (cost=0.00..7887659.31 rows=18971 width=20) (actual
time=2.57..711940.78 rows=513 loops=1)
-> Index Scan using id_dictid on table (cost=0.00..7886236.46
rows=189714 width=20) (actual time=2.57..710482.07 rows=199131 loops=1)
Total runtime: 711942.76 msec

A couple of questions arise:

1) Why chooses the planner to use id_owid (and not id_dowid as it would
seem logical) for the first query?
2) Why is NO index used for the second query, the only difference being in
the constraint value (owid is set vs. owid is null)?
3) Why does it use id_dictid for the second query when forced to, and not
id_owid or id_dowid?
4) What could I do to get the planner to use the index access method (apart
from setting enable_seqscan to off)?

This is PostgreSQL 7.2.1 on a dual 1GHz / 1GB Dell server
SHMALL = 128MB
SHMMAX = 128MB
shared_buffers = 4000
sort_mem = 4000

Thanks for any pointers/insights/whatever makes sense,

--
>O Ernest E. Vogelsinger
(\) ICQ #13394035
^ http://www.vogelsinger.at/

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jim C. Nasby 2003-06-12 21:52:50 Re: Postgres performance comments from a MySQL user
Previous Message Dmitri Bichko 2003-06-12 21:24:40 Re: Best pg_dump practices