Query optimization problem

From: Zotov <zotov(at)oe-it(dot)ru>
To: pgsql-performance(at)postgresql(dot)org
Subject: Query optimization problem
Date: 2010-07-15 06:12:27
Message-ID: 4C3EA6CB.4070502@oe-it.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I have a query:

SELECT d1.ID, d2.ID
FROM DocPrimary d1
JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
WHERE (d1.ID=234409763) or (d2.ID=234409763)

i think what QO(Query Optimizer) can make it faster (now it seq scan and on
million records works 7 sec)

SELECT d1.ID, d2.ID
FROM DocPrimary d1
JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
WHERE (d2.BasedOn=234409763) or (d2.ID=234409763)

----------------------
Slow Query
----------------------
test=# EXPLAIN (ANALYZE on, VERBOSE on, COSTS on, BUFFERS off )SELECT d1.ID,
d2.ID
test-# FROM DocPrimary d1
test-# JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
test-# WHERE (d1.ID=234409763) or (d2.ID=234409763);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=58.15..132.35 rows=2 width=8) (actual time=0.007..0.007
rows=0 loops=1)
Output: d1.id, d2.id
Hash Cond: (d2.basedon = d1.id)
Join Filter: ((d1.id = 234409763) OR (d2.id = 234409763))
-> Seq Scan on public.docprimary d2 (cost=0.00..31.40 rows=2140
width=8) (actual time=0.002..0.002 rows=0 loops=1)
Output: d2.id, d2.basedon
-> Hash (cost=31.40..31.40 rows=2140 width=4) (never executed)
Output: d1.id
-> Seq Scan on public.docprimary d1 (cost=0.00..31.40 rows=2140
width=4) (never executed)
Output: d1.id

------------------
Fast Query
------------------
test=# EXPLAIN (ANALYZE on, VERBOSE on, COSTS on, BUFFERS off )SELECT d1.ID,
d2.ID
test-# FROM DocPrimary d1
test-# JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
test-# WHERE (d2.BasedOn=234409763) or (d2.ID=234409763);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=8.60..58.67 rows=12 width=8) (actual time=0.026..0.026
rows=0 loops=1)
Output: d1.id, d2.id
-> Bitmap Heap Scan on public.docprimary d2 (cost=8.60..19.31 rows=12
width=8) (actual time=0.023..0.023 rows=0 loops=1)
Output: d2.id, d2.basedon
Recheck Cond: ((d2.basedon = 234409763) OR (d2.id = 234409763))
-> BitmapOr (cost=8.60..8.60 rows=12 width=0) (actual
time=0.018..0.018 rows=0 loops=1)
-> Bitmap Index Scan on basedon_idx (cost=0.00..4.33
rows=11 width=0) (actual time=0.008..0.008 rows=0 loops=1)
Index Cond: (d2.basedon = 234409763)
-> Bitmap Index Scan on id_pk (cost=0.00..4.26 rows=1
width=0) (actual time=0.003..0.003 rows=0 loops=1)
Index Cond: (d2.id = 234409763)
-> Index Scan using id_pk on public.docprimary d1 (cost=0.00..3.27
rows=1 width=4) (never executed)
Output: d1.id, d1.basedon
Index Cond: (d1.id = d2.basedon)

--------------------------------------------
PGver: PostgreSQL 9.0b x86
OS: Win7 x64

---------------------
Create table query:
---------------------

CREATE TABLE docprimary
(
id integer NOT NULL,
basedon integer,
CONSTRAINT id_pk PRIMARY KEY (id)
);
CREATE INDEX basedon_idx
ON docprimary
USING btree
(basedon);

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Ryan Wexler 2010-07-15 06:18:10 Re: performance on new linux box
Previous Message Ben Chobot 2010-07-15 02:50:28 Re: performance on new linux box