Re: mysterious difference in speed when combining two queries with OR

From: hans <hans(dot)ekbrand(at)sociology(dot)gu(dot)se>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: mysterious difference in speed when combining two queries with OR
Date: 2008-04-23 09:06:08
Message-ID: 20080423090605.GE6957@Spelmaskinen
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, Apr 23, 2008 at 09:58:10AM +0200, A. Kretschmer wrote:
> am Wed, dem 23.04.2008, um 9:23:07 +0200 mailte Hans Ekbrand folgendes:
> > I cannot understand why the following two queries differ so much in execution time (almost ten times)
>
> wild guess: different execution plans.
>
>
> Can you show us the plans? (EXPLAIN ANALYSE SELECT ...)

Query A (first part)

fektest=> explain analyse select distinct moment.mid from moment,timecard where parent = 45 and (pid=17 and timecard.mid = moment.mid) order by moment.mid;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Unique (cost=287.66..288.67 rows=203 width=4) (actual time=11.412..11.622 rows=41 loops=1)
-> Sort (cost=287.66..288.16 rows=203 width=4) (actual time=11.409..11.484 rows=57 loops=1)
Sort Key: moment.mid
-> Hash Join (cost=60.98..279.88 rows=203 width=4) (actual time=2.346..11.182 rows=57 loops=1)
Hash Cond: ("outer".mid = "inner".mid)
-> Seq Scan on timecard (cost=0.00..211.78 rows=1017 width=4) (actual time=0.031..7.427 rows=995 loops=1)
Filter: (pid = 17)
-> Hash (cost=59.88..59.88 rows=444 width=4) (actual time=2.127..2.127 rows=0 loops=1)
-> Seq Scan on moment (cost=0.00..59.88 rows=444 width=4) (actual time=0.027..1.825 rows=199 loops=1)
Filter: (parent = 45)
Total runtime: 11.852 ms
(11 rows)

Query A (second part)

fektest=> explain analyse select distinct moment.mid from moment,timecard where parent = 45 and (pbar = 0) order by moment.mid;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=192.62..3800.67 rows=20 width=4) (actual time=0.578..109.274 rows=2 loops=1)
-> Nested Loop (cost=192.62..3417.57 rows=153240 width=4) (actual time=0.575..89.546 rows=15324 loops=1)
-> Index Scan using moment_mid_idx on moment (cost=0.00..160.15 rows=20 width=4) (actual time=0.544..3.490 rows=2 loops=1)
Filter: ((parent = 45) AND (pbar = 0))
-> Materialize (cost=192.62..269.24 rows=7662 width=0) (actual time=0.009..21.998 rows=7662 loops=2)
-> Seq Scan on timecard (cost=0.00..192.62 rows=7662 width=0) (actual time=0.007..14.554 rows=7662 loops=1)
Total runtime: 109.870 ms
(7 rows)

Query B

fektest=> EXPLAIN ANALYSE SELECT distinct moment.mid from moment,timecard where parent = 45 and ((pid=17 and timecard.mid = moment.mid) or (pbar = 0)) order by moment.mid;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=192.62..102469.31 rows=444 width=4) (actual time=143.444..4838.067 rows=42 loops=1)
-> Nested Loop (cost=192.62..102405.04 rows=25710 width=4) (actual time=143.439..4818.215 rows=15379 loops=1)
Join Filter: ((("inner".pid = 17) OR ("outer".pbar = 0)) AND (("inner".mid = "outer".mid) OR ("outer".pbar = 0)))
-> Index Scan using moment_mid_idx on moment (cost=0.00..154.58 rows=444 width=8) (actual time=0.390..5.954 rows=199 loops=1)
Filter: (parent = 45)
-> Materialize (cost=192.62..269.24 rows=7662 width=8) (actual time=0.001..9.728 rows=7662 loops=199)
-> Seq Scan on timecard (cost=0.00..192.62 rows=7662 width=8) (actual time=0.007..17.007 rows=7662 loops=1)
Total runtime: 4838.786 ms
(8 rows)

> > I should say that this is on postgresql 7.4.16 (debian stable).
>
> Uhh. Why not a recent version? We have 8.3.0...

No particularly good reason, just that I have taken over a production
system and I didn't want to mess up with before I am confident with
it. But I on a test-site I have migrated to 8.1 without problems, so
migration will happen, we just haven't a reason for doing it yet,
since 7.4 has served us well.

> > Can query B be rewritten so that it would execute faster?
>
> Quick and dirty: use both selects (query A) combined with UNION.

I will look into that.

> I guess, with a recent version the planner can use a bitmap index scan
> to perform Query B faster.

That might be a good reason to upgrade :-)

Thanks for your answer.

--
Every non-free program has a lord, a master --
and if you use the program, he is your master.
Learn to master free software: www.ubuntulinux.com

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Hans Ekbrand 2008-04-23 09:22:09 Re: mysterious difference in speed when combining two queries with OR
Previous Message A. Kretschmer 2008-04-23 08:57:04 Re: mysterious difference in speed when combining two queries with OR