Skip site navigation (1) Skip section navigation (2)

Re: Strange planner decision on quite simple select

From: Richard Huxton <dev(at)archonet(dot)com>
To: Markus Wollny <Markus(dot)Wollny(at)computec(dot)de>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Strange planner decision on quite simple select
Date: 2005-10-25 10:07:28
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance
Markus Wollny wrote:
> Hello!
> I've got a table BOARD_MESSAGE (message_id int8, thread_id int8, ...)
> with pk on message_id and and a non_unique not_null index on thread_id.
> A count(*) on BOARD_MESSAGE currently yields a total of 1231171 rows,
> the planner estimated a total of 1232530 rows in this table. I've got
> pg_autovacuum running on the database and run an additional nightly
> VACUUM ANALYZE over it every night.
> I've got a few queries of the following type:
> select          * 
>                                 from PUBLIC.BOARD_MESSAGE 
>                                 where THREAD_ID = 3354253 
>                                 order by        MESSAGE_ID asc 
>                                 limit           20 
>                                 offset          0; 
> There are currently roughly 4500 rows with this thread_id in
> BOARD_MESSAGE. Explain-output is like so:
>                                                       QUERY PLAN 
> ------------------------------------------------------------------------
> ---------------------------------------------- 
>  Limit  (cost=0.00..3927.22 rows=20 width=1148) 
>    ->  Index Scan using pk_board_message on board_message
> (cost=0.00..1100800.55 rows=5606 width=1148) 
>          Filter: (thread_id = 3354253) 
> (3 rows) 
> I didn't have the patience to actually complete an explain analyze on
> that one - I cancelled the query on several attempts after more than 40
> minutes runtime. Now I fiddled a little with this statement and tried
> nudging the planner in the right direction like so:

Hmm - it shouldn't take that long. If I'm reading this right, it's 
expecting to have to fetch 5606 rows to match thread_id=3354253 the 20 
times you've asked for. Now, what it probably doesn't know is that 
thread_id is correlated with message_id quite highly (actually, I don't 
know that, I'm guessing). So - it starts at message_id=1 and works 
along, but I'm figuring that it needs to reach message_id's in the 3-4 
million range to see any of the required thread.

1. Try "ORDER BY thread_id,message_id" and see if that nudges things 
your way.
2. Keep #1 and try replacing the index on (thread_id) with 

   Richard Huxton
   Archonet Ltd

In response to

pgsql-performance by date

Next:From: Rich DoughtyDate: 2005-10-25 11:35:06
Subject: Outer join query plans and performance
Previous:From: Markus WollnyDate: 2005-10-25 09:47:57
Subject: Strange planner decision on quite simple select

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group