Skip site navigation (1) Skip section navigation (2)

Re: LIMIT confuses the planner

From: Kouber Saparev <kouber(at)saparev(dot)com>
To: pgsql-performance(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: LIMIT confuses the planner
Date: 2009-03-24 10:52:24
Message-ID: 49C8BB68.9050004@saparev.com (view raw or flat)
Thread:
Lists: pgsql-performance
Now I am experiencing similar issue with another table, called 
"message", for which there's a conditional index:

CREATE TABLE message (
   message_sid SERIAL PRIMARY KEY,
   from_profile_sid INT NOT NULL REFERENCES profile,
   to_profile_sid INT NOT NULL REFERENCES profile,
   sender_has_deleted BOOLEAN NOT NULL DEFAULT FALSE,
   receiver_has_deleted BOOLEAN NOT NULL DEFAULT FALSE,
   datetime TIMESTAMP NOT NULL DEFAULT NOW(),
   body TEXT
);

CREATE INDEX message_from_profile_idx (from_profile_sid) WHERE NOT 
sender_has_deleted;


So, again... adding a LIMIT makes the planner choose the "wrong" index.


db=# EXPLAIN ANALYZE SELECT 
          message_sid
FROM
   message
WHERE
   from_profile_sid = 1296 AND NOT sender_has_deleted
ORDER BY
   message_sid DESC;
                                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
  Sort  (cost=2307.70..2310.74 rows=1215 width=4) (actual 
time=0.040..0.040 rows=2 loops=1)
    Sort Key: message_sid
    Sort Method:  quicksort  Memory: 25kB
    ->  Bitmap Heap Scan on message  (cost=23.59..2245.45 rows=1215 
width=4) (actual time=0.029..0.033 rows=2 loops=1)
          Recheck Cond: ((from_profile_sid = 1296) AND (NOT 
sender_has_deleted))
          ->  Bitmap Index Scan on message_from_profile_idx 
(cost=0.00..23.28 rows=1215 width=0) (actual time=0.022..0.022 rows=2 
loops=1)
                Index Cond: (from_profile_sid = 1296)
  Total runtime: 0.068 ms
(8 rows)




db=# EXPLAIN ANALYZE SELECT
   message_sid
FROM
   message
WHERE
   from_profile_sid = 1296 AND NOT sender_has_deleted
ORDER BY
   message_sid DESC LIMIT 20;
                                                                   QUERY 
PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=0.00..1461.12 rows=20 width=4) (actual 
time=0.817..932.398 rows=2 loops=1)
    ->  Index Scan Backward using message_pkey on message 
(cost=0.00..88762.80 rows=1215 width=4) (actual time=0.816..932.395 
rows=2 loops=1)
          Filter: ((NOT sender_has_deleted) AND (from_profile_sid = 1296))
  Total runtime: 932.432 ms
(4 rows)



I had already increased STATISTICS to 1000 for both from_profile_sid and 
sender_has_deleted, and vacuum analyzed respectively (also did reindex), 
but still statistical data is confusing me:


db=# SELECT n_distinct FROM pg_stats WHERE tablename='message' AND 
attname='from_profile_sid';

  n_distinct
------------
        4068
(1 row)

db=# select avg(length) from (select from_profile_sid, count(*) as 
length from message group by from_profile_sid) as freq;

  avg
----------------------
  206.5117822008693663
(1 row)



Any ideas/thoughts?


-- 
Kouber Saparev
http://kouber.saparev.com

In response to

pgsql-performance by date

Next:From: Heikki LinnakangasDate: 2009-03-24 11:07:34
Subject: Re: Why creating GIN table index is so slow than inserting data into empty table with the same index?
Previous:From: Tom LaneDate: 2009-03-24 03:35:12
Subject: Re: Why creating GIN table index is so slow than inserting data into empty table with the same index?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group