Re: Query optimization problem

From: Zotov <zotov(at)oe-it(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Query optimization problem
Date: 2010-07-22 12:30:11
Message-ID: 4C4839D3.2020805@oe-it.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

20.07.2010 18:31, Robert Haas:
> According to the EXPLAIN ANALYZE output, your "slow" query is
> executing in 0.007 ms, and your "fast" query is executing in 0.026 ms
> (i.e. not as quickly as the slow query). Since you mention that it
> takes 7 s further down, I suspect this is not the real EXPLAIN ANALYZE
> output on the real data that you're having a problem with. You might
> have better luck if you post the actual EXPLAIN ANALYZE output here.
> Incidentally, sorry for not responding sooner to your private email -
> I was on vacation last week. But please do keep all replies on-list
> so that everyone can comment.
>
> All that having been said, I think the issue here is that the query
> planner isn't inferring that d1.ID=<some constant> implies d2.ID=<some
> constant>, even though there's a join clause d1.ID=d2.ID. I'm not
> really sure why it isn't doing that... I suspect Tom Lane is the only
> person who can comment intelligently on that, and he's away this week
> (but if anyone else has an idea, feel free to jump in...).
>
Yes, I have a mistake when EXPLAIN ANALYZE without data.. It create
another plan, because seq scan were faster. Now I send results on real
data (1 million rows)

*Slow Query:*
-------------------------------------------------
test=# EXPLAIN (ANALYZE on, VERBOSE on, COSTS on, BUFFERS off )SELECT
d1.ID, d2.ID
test-# FROM DocPrimary d1
test-# JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
test-# WHERE (d1.ID=234409763
test(# ) OR (d2.ID=234409763);

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=2222.72..53967.30 rows=1 width=8) (actual
time=6697.847..6697.847 rows=0 loops=1)
Output: d1.id, d2.id
Merge Cond: (d1.id = d2.basedon)
Join Filter: ((d1.id = 234409763) OR (d2.id = 234409763))
-> Index Scan using id_pk on public.docprimary d1
(cost=0.00..37224.48 rows=1076842 width=4) (actual time=0.016..3184.474
rows=1076795 loops=1)
Output: d1.id, d1.basedon
-> Index Scan using basedon_idx on public.docprimary d2
(cost=0.00..46245.14 rows=1076842 width=8) (actual time=0.011..1861.570
rows=235362 loops=1)
Output: d2.id, d2.basedon
Total runtime: 6697.968 ms
(9 rows)
-----------------------------------------------

*Fast Query:*
----------------------------------------------
test=# EXPLAIN (ANALYZE on, VERBOSE on, COSTS on, BUFFERS off )SELECT
d1.ID, d2.ID
test-# FROM DocPrimary d1
test-# JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
test-# WHERE (d1.ID=234409763 and d2.BasedOn=234409763
test(# ) OR (d2.ID=234409763);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=9.01..422.70 rows=1 width=8) (actual
time=0.145..0.145 rows=0 loops=1)
Output: d1.id, d2.id
Join Filter: (((d1.id = 234409763) AND (d2.basedon = 234409763)) OR
(d2.id = 234409763))
-> Bitmap Heap Scan on public.docprimary d2 (cost=9.01..136.90
rows=34 width=8) (actual time=0.141..0.141 rows=0 loops=1)
Output: d2.id, d2.basedon
Recheck Cond: ((d2.basedon = 234409763) OR (d2.id = 234409763))
-> BitmapOr (cost=9.01..9.01 rows=34 width=0) (actual
time=0.136..0.136 rows=0 loops=1)
-> Bitmap Index Scan on basedon_idx (cost=0.00..4.62
rows=33 width=0) (actual time=0.078..0.078 rows=0 loops=1)
Index Cond: (d2.basedon = 234409763)
-> Bitmap Index Scan on id_pk (cost=0.00..4.38 rows=1
width=0) (actual time=0.051..0.051 rows=0 loops=1)
Index Cond: (d2.id = 234409763)
-> Index Scan using id_pk on public.docprimary d1 (cost=0.00..8.39
rows=1 width=4) (never executed)
Output: d1.id, d1.basedon
Index Cond: (d1.id = d2.basedon)
Total runtime: 0.233 ms
(15 rows)
------------------------------------------

I use another fast query:
SELECT d1.ID, d2.ID
FROM DocPrimary d1
JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
WHERE (*d1.ID=234409763 and d2.BasedOn=234409763*) OR (d2.ID=234409763)

Bolded part of query was d2.BasedOn=234409763 I replace it because it
can help find way to optimize it automaticaly

So sorry, but i can`t give programmer to do something in Postgres, because
we don`t use it now as supported DB, we think about it and do some tests.
It`s very hard and slow task (support another DB, now we use FireBird,
and plan use another DB, and look for Postgres and MSSQL, maybe support
it both as free and commercial DB solution)
And in our department only 4 (with me) programmers who can programm on
"System Level", and only one of us (doesn`t me) know C/C++
We all programming on Delphi... If we choose Postgres as free DB
platform then I can think about give programmers for Postgress development.

In so large letters my English stay more bad :)

Browse pgsql-hackers by date

  From Date Subject
Next Message Markus Wanner 2010-07-22 12:49:29 Re: dynamically allocating chunks from shared memory
Previous Message Robert Haas 2010-07-22 11:04:32 Re: dynamically allocating chunks from shared memory