Re: Simple indexed IN query takes 40 seconds

From: tv(at)fuzzy(dot)cz
To: "Andrus" <kobruleht2(at)hot(dot)ee>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Simple indexed IN query takes 40 seconds
Date: 2008-11-10 17:12:22
Message-ID: 10833.217.77.161.17.1226337142.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Obviously, most of the total cost (cost=327569, time=39749ms) comes from
two operations in the execution plan:

(a) sequential scan on the 'rid' table (cost=195342, time=26347ms) that
produces almost 3.200.000 rows
(b) hash join of the two subresults (cost=240000, time=14000ms)

How many rows are there in the 'rid' table? If the 'IN' clause selects
more than a few percent of the table, the index won't be used as the
sequential scan of the whole table will be faster than random access
(causing a lot of seeks).

Try to:

(a) analyze the table - might help if the stats are too old and don't
reflect current state
(b) increase the statistics target of the table (will give more precise
stats, allowing to select a better plan)
(c) tune the 'cost' parameters of the planner - the default values are
quite conservative, so if you have fast disks (regarding seeks) the
sequential scan may be chosen too early, you may even 'turn off' the
sequential scan

regards
Tomas

> I found that simple IN query on indexed tables takes too much time.
>
> dok and rid have both indexes on int dokumnr columnr and dokumnr is not
> null.
> PostgreSql can use index on dok or event on rid so it can executed fast.
>
> How to make this query to run fast ?
>
> Andrus.
>
>
>
> note: list contain a lot of integers, output below is abbreviated in this
> part.
>
> explain analyze select
> sum(rid.kogus)
> from dok JOIN rid USING(dokumnr)
> where dok.dokumnr in
> (869906,869907,869910,869911,869914,869915,869916,869917,869918,869921,869925,869926,869928,869929,869934,869935,869936,...)
>
> "Aggregate (cost=327569.15..327569.16 rows=1 width=9) (actual
> time=39749.842..39749.846 rows=1 loops=1)"
> " -> Hash Join (cost=83872.74..327537.74 rows=12563 width=9) (actual
> time=25221.702..39697.249 rows=11857 loops=1)"
> " Hash Cond: ("outer".dokumnr = "inner".dokumnr)"
> " -> Seq Scan on rid (cost=0.00..195342.35 rows=3213135 width=13)
> (actual time=0.046..26347.959 rows=3243468 loops=1)"
> " -> Hash (cost=83860.76..83860.76 rows=4792 width=4) (actual
> time=128.366..128.366 rows=4801 loops=1)"
> " -> Bitmap Heap Scan on dok (cost=9618.80..83860.76
> rows=4792 width=4) (actual time=58.667..108.611 rows=4801 loops=1)"
> " Recheck Cond: ((dokumnr = 869906) OR (dokumnr =
> 869907)
> OR (dokumnr = 869910) OR (dokumnr = 869911) OR (dokumnr = 869914) OR
> (dokumnr = 869915) OR (dokumnr = 869916) OR (dokumnr = 869917) OR (dokumnr
> =
> 869918) OR (dokumnr = 869921) OR (dokumnr = 869925) OR (dokumnr = 869926)
> OR
> (dokumnr = 869928) OR (dokumnr = 869929) OR (dokumnr = 869934) OR (dokumnr
> =
> 869935) OR (dokumnr = 869936) OR (dokumnr = 869937) OR (dokumnr = 869940)
> OR
> (dokumnr = 869941) OR (dokumnr = 869945) OR (dokumnr = 869951) OR (dokumnr
> =
> 869964) OR (dokumnr = 869966) OR (dokumnr = 869969) OR (dokumnr = 869974)
> OR
> (dokumnr = 869979) OR (dokumnr = 869986) OR (dokumnr = 869992) OR (dokumnr
> =
> 869993) OR (dokumnr = 869995) OR (dokumnr = 869997) OR (dokumnr = 870007)
> OR
> (dokumnr = 870018) OR (dokumnr = 870021) OR (dokumnr = 870023) OR (dokumnr
> =
> 870025) OR (dokumnr = 870033) OR (dokumnr = 870034) OR (dokumnr = 870036)
> OR
> (dokumnr = 870038) OR (dokumnr = 870043) OR (dokumnr = 870044) OR (dokumnr
> =
> 870046) OR (dokumnr = 870050) OR (dokumnr = 870051) OR (dokumnr = 870053)
> OR
> (dokumnr = 870054) OR (dokumnr = 870055) OR (dokumnr = 870064) OR (dokumnr
> =
> 870066) OR (dokumnr = 870069) OR (dokumnr = 870077) OR (dokumnr = 870079)
> OR
> (dokumnr = 870081) OR (dokumnr = 870084) OR (dokumnr = 870085) OR (dokumnr
> =
> 870090) OR (dokumnr = 870096) OR (dokumnr = 870110) OR (dokumnr = 870111)
> OR
> (dokumnr = 870117) OR (dokumnr = 870120) OR (dokumnr = 870124) OR (dokumnr
> =
> 870130)
> ...
> OR (dokumnr = 890907) OR (dokumnr = 890908))"
> " -> BitmapOr (cost=9618.80..9618.80 rows=4801
> width=0)
> (actual time=58.248..58.248 rows=0 loops=1)"
> " -> Bitmap Index Scan on dok_dokumnr_idx
> (cost=0.00..2.00 rows=1 width=0) (actual time=0.052..0.052 rows=3
> loops=1)"
> " Index Cond: (dokumnr = 869906)"
> " -> Bitmap Index Scan on dok_dokumnr_idx
> (cost=0.00..2.00 rows=1 width=0) (actual time=0.011..0.011 rows=3
> loops=1)"
> " Index Cond: (dokumnr = 869907)"
> " -> Bitmap Index Scan on dok_dokumnr_idx
> (cost=0.00..2.00 rows=1 width=0) (actual time=0.020..0.020 rows=3
> loops=1)"
> " Index Cond: (dokumnr = 869910)"
> " -> Bitmap Index Scan on dok_dokumnr_idx
> (cost=0.00..2.00 rows=1 width=0) (actual time=0.010..0.010 rows=3
> loops=1)"
> " Index Cond: (dokumnr = 869911)"
> " -> Bitmap Index Scan on dok_dokumnr_idx
> (cost=0.00..2.00 rows=1 width=0) (actual time=0.008..0.008 rows=3
> loops=1)"
> " Index Cond: (dokumnr = 869914)"
> ...
> " -> Bitmap Index Scan on dok_dokumnr_idx
> (cost=0.00..2.00 rows=1 width=0) (actual time=0.008..0.008 rows=1
> loops=1)"
> " Index Cond: (dokumnr = 890908)"
> "Total runtime: 39771.385 ms"
>
> "PostgreSQL 8.1.4 on i686-pc-linux-gnu, compiled by GCC
> i686-pc-linux-gnu-gcc (GCC) 3.4.6 (Gentoo 3.4.6-r1, ssp-3.4.5-1.0,
> pie-8.7.9)"
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Anshul Dutta 2008-11-10 17:14:10 paging on windows
Previous Message tv 2008-11-10 16:41:40 Re: slow full table update