Skip site navigation (1) Skip section navigation (2)

Query gets slow when where clause increases

From: James Antill <james(at)and(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Query gets slow when where clause increases
Date: 2004-06-29 21:56:36
Message-ID: m3oen281uz.fsf@code.and.org (view raw or flat)
Thread:
Lists: pgsql-performance
 I have a problem where a query gets _much_ slower if I add statements
to the where clause than if I just get everything and filter in the
code. I expected the query to be faster as the where clause gets
bigger (and thus the data to return gets lower).

 I've put inline the SQL and explain analyze of both the general and
specific queries. I assume the problem is something to do with the
fact that in the specific query "ticket_groups" is uses an index and
is assumed to return 5 rows, but actually returns 604 and in the
generic case it doesn't use an index and is assumed to return 3.5-4
thousand and does? Is this right, and if so is it possible to get
postgres to re-think using an index in this case (or possible up it's
estimated row count)?
 Any help appreciated.

----------------------- GENERIC -------------------------------------

SELECT   g.assigned_to, g.opened AS escalated, t.tid, t.opened,
         e.userid AS commiter, e.eid, e.performed_on, e.status
  FROM (events e JOIN (ticket_groups g JOIN tickets t USING(tid)) USING(tid))
   WHERE (g.gid = '37' AND 
          (e.performed_on <= CAST('1088480270' AS bigint)) AND
          ((t.modified >= CAST('1086875491' AS bigint)) OR
           (t.status != '4' AND
            t.status != '5' AND
            t.status != '9')))
   ORDER BY e.userid, t.tid, e.performed_on;

 Sort  (cost=28017.06..28054.97 rows=15166 width=58) (actual time=2057.25..2076.12 rows=26594 loops=1)
   Sort Key: e.userid, t.tid, e.performed_on
   ->  Merge Join  (cost=1251.59..26963.93 rows=15166 width=58) (actual time=231.81..1972.29 rows=26594 loops=1)
         Merge Cond: ("outer".tid = "inner".tid)
         ->  Index Scan using idx_tid_events on events e  (cost=0.00..18943.61 rows=268784 width=26) (actual time=11.48..1358.15 rows=268803 loops=1)
               Filter: (performed_on <= 1088480270::bigint)
         ->  Materialize  (cost=7160.00..7160.00 rows=1725 width=32) (actual time=217.14..237.94 rows=26592 loops=1)
               ->  Merge Join  (cost=1251.59..7160.00 rows=1725 width=32) (actual time=63.14..214.75 rows=983 loops=1)
                     Merge Cond: ("outer".tid = "inner".tid)
                     ->  Index Scan using idx_tickets_tid on tickets t  (cost=0.00..5820.13 rows=18823 width=12) (actual time=2.97..135.57 rows=6020 loops=1)
                           Filter: (((status <> 4::smallint) OR (modified >= 1086875491::bigint)) AND ((status <> 5::smallint) OR (modified >= 1086875491::bigint)) AND ((status <> 9::smallint) OR (modified >= 1086875491::bigint)))
                     ->  Sort  (cost=1251.59..1261.38 rows=3915 width=20) (actual time=60.13..62.96 rows=3699 loops=1)
                           Sort Key: g.tid
                           ->  Seq Scan on ticket_groups g  (cost=0.00..1017.94 rows=3915 width=20) (actual time=0.05..53.02 rows=3699 loops=1)
                                 Filter: (gid = 37)
 Total runtime: 2100.75 msec

--------------------- SPECIFIC -------------------------------------

SELECT   g.assigned_to, g.opened AS escalated, t.tid, t.opened,
         e.userid AS commiter, e.eid, e.performed_on, e.status
  FROM (events e JOIN (ticket_groups g JOIN tickets t USING(tid)) USING(tid))
   WHERE (g.gid = '37' AND 
          (e.performed_on <= CAST('1088480270' AS bigint)) AND
          ((t.modified >= CAST('1086875491' AS bigint)) OR
           (t.status != '4' AND
            t.status != '5' AND
            t.status != '9')) AND (g.assigned_to IS NOT NULL AND g.assigned_to='1540') AND e.userid='1540')
   ORDER BY e.userid, t.tid, e.performed_on;

 Sort  (cost=5079.17..5079.18 rows=1 width=58) (actual time=218121.00..218122.02 rows=1441 loops=1)
   Sort Key: e.userid, t.tid, e.performed_on
   ->  Nested Loop  (cost=0.00..5079.16 rows=1 width=58) (actual time=85.28..218115.36 rows=1441 loops=1)
         Join Filter: ("outer".tid = "inner".tid)
         ->  Nested Loop  (cost=0.00..305.53 rows=1 width=46) (actual time=0.22..261.78 rows=2420 loops=1)
               ->  Index Scan using idx_ticket_groups_assigned on ticket_groups g  (cost=0.00..241.76 rows=5 width=20) (actual time=0.13..12.67 rows=604 loops=1)
                     Index Cond: (assigned_to = 1540)
                     Filter: ((gid = 37) AND (assigned_to IS NOT NULL))
               ->  Index Scan using idx_tid_events on events e  (cost=0.00..12.50 rows=1 width=26) (actual time=0.11..0.38 rows=4 loops=604)
                     Index Cond: (e.tid = "outer".tid)
                     Filter: ((performed_on <= 1088480270::bigint) AND (userid = 1540))
         ->  Seq Scan on tickets t  (cost=0.00..4538.35 rows=18823 width=12) (actual time=0.16..83.53 rows=6020 loops=2420)
               Filter: (((status <> 4::smallint) OR (modified >= 1086875491::bigint)) AND ((status <> 5::smallint) OR (modified >= 1086875491::bigint)) AND ((status <> 9::smallint) OR (modified >= 1086875491::bigint)))
 Total runtime: 218123.24 msec


-- 
James Antill -- james(at)and(dot)org
Need an efficient and powerful string library for C?
http://www.and.org/vstr/

Responses

pgsql-performance by date

Next:From: Scott MarloweDate: 2004-06-29 22:36:55
Subject: Re: High load average with PostgreSQL 7.4.2 on
Previous:From: Bruno Wolff IIIDate: 2004-06-29 19:51:30
Subject: Re: Query performance

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group