Re: Some issues with planner and query optimization

From: Richard Huxton <dev(at)archonet(dot)com>
To: Boguk Maxim <astar(at)rambler-co(dot)ru>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Some issues with planner and query optimization
Date: 2007-02-05 14:11:26
Message-ID: 45C73B0E.90503@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Boguk Maxim wrote:
> Postgres 8.1
> Sample test table:
> (all queries done on fresh vacuumed analyzed table with statistics on
> rub_id and news_dtime set to 1000)
> (all table in memory and server do not doing anything other)
>
> media=> \d test_table
> Table "public.test_table"
> Column | Type | Modifiers
> ------------+-----------------------------+-----------
> id | integer |
> rub_id | integer |
> news_id | integer |
> news_dtime | timestamp without time zone |
> Indexes:
> "test_table_pk" UNIQUE, btree (id)
> "test_table_main_idx" btree (rub_id, news_dtime)
>
> media=> select count(*) from test_table;
> count
> ---------
> 5834463
> media=> select count(distinct rub_id) from test_table;
> count
> -------
> 342
>
> Now doing 3 simple query:
>
> First:
> media=> EXPLAIN ANALYZE select * from test_table where rub_id IN (5)
> order by news_dtime limit 20;
> QUERY
> PLAN
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> Limit (cost=0.00..10.73 rows=20 width=20) (actual time=0.018..0.121
> rows=20 loops=1)
> -> Index Scan using test_table_main_idx on test_table
> (cost=0.00..29758.11 rows=55447 width=20) (actual time=0.014..0.054
> rows=20 loops=1)
> Index Cond: (rub_id = 5)
> Total runtime: 0.186 ms
>
> Second (almost same but with rub_id 8):
> media=> EXPLAIN ANALYZE select * from test_table where rub_id IN (8)
> order by news_dtime limit 20;
> QUERY
> PLAN
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> -
> Limit (cost=0.00..1.98 rows=20 width=20) (actual time=0.019..0.121
> rows=20 loops=1)
> -> Index Scan using test_table_main_idx on test_table
> (cost=0.00..45976.37 rows=463684 width=20) (actual time=0.014..0.054
> rows=20 loops=1)
> Index Cond: (rub_id = 8)
> Total runtime: 0.186 ms
>
>
> Now try with rub_id IN (5,8) (I was assumed query will work 2-10 time
> longer max... With almost same plan)
> But i'm got bad plan/really slow query:
>
> media=> EXPLAIN ANALYZE select * from test_table where rub_id IN (5,8)
> order by news_dtime limit 20;
>
> QUERY PLAN
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> -------------
> Limit (cost=103337.45..103337.50 rows=20 width=20) (actual
> time=4437.841..4437.976 rows=20 loops=1)
> -> Sort (cost=103337.45..104624.26 rows=514725 width=20) (actual
> time=4437.836..4437.873 rows=20 loops=1)
> Sort Key: news_dtime
> -> Bitmap Heap Scan on test_table (cost=3818.96..54506.92
> rows=514725 width=20) (actual time=82.139..1100.021 rows=515340 loops=1)
> Recheck Cond: ((rub_id = 5) OR (rub_id = 8))
> -> BitmapOr (cost=3818.96..3818.96 rows=519131 width=0)
> (actual time=80.498..80.498 rows=0 loops=1)
> -> Bitmap Index Scan on test_table_main_idx
> (cost=0.00..409.06 rows=55447 width=0) (actual time=8.342..8.342
> rows=54959 loops=1)
> Index Cond: (rub_id = 5)
> -> Bitmap Index Scan on test_table_main_idx
> (cost=0.00..3409.89 rows=463684 width=0) (actual time=72.146..72.146
> rows=460381 loops=1)
> Index Cond: (rub_id = 8)
> Total runtime: 4458.999 ms
> (11 rows)
>
> Ouch.... 25000 slower...
> Why planner not try two index scan and merge results...

Try: ORDER BY rub_id, news_dtime
Does that give it enough of a hint?
The problem is you're asking for the 20 oldest regardless of rub_id, so
the index isn't as much use as it might be.

Perhaps an index on (news_dtime,rub_id) rather than the other way around?

--
Richard Huxton
Archonet Ltd

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Alvaro Herrera 2007-02-05 15:11:06 Re: Index bloat of 4x
Previous Message Martijn van Oosterhout 2007-02-05 14:06:32 Re: 8.2.1 Compiling Error