From: Michael Glaesemann <grzm@seespotcode.net>
To: Tom Tamulewicz <tomjt7@hotmail.com>
CC: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Slow join query
Date: Fri, 22 Jun 2007 14:51:32 -0500
>
>On Jun 22, 2007, at 13:32 , Tom Tamulewicz wrote:
>>( p.void_flag IS NULL OR p.void_flag = false )
>Just a note: you can rewrite (a IS NULL or a = false) as (a IS NOT
>TRUE). Shouldn't affect performance, but might make your query
>easier to read.
>
>What's the EXPLAIN ANALYZE output for this query?
>>When the query runs, the hard drive lights up for the duration.
>>(I'm confused by this as 'top' reports only 24k of swap in use).
>>My SUSE 9 test machine has 512 Meg of RAM with 300 Meg used by a
>>Java app. Postmaster reports 56 Meg under "top" and has a 52 Meg
>>segment under "ipcs". I've played with the cache size, shared
>>buffers, and OS shmmax with little change in the query performance.
>>
>>Q: Would this query benefit from using a view between these two
>>tables?
>I doubt it, as views are just pre-parsed queries: no data is
>materialized for the view.
>>Q: Any idea why the reported swap usage is so low, yet the query
>>slams the drive? Is postgres not caching this data? If I run the
>>query with the same arguments, it comes right back the second
>>time. If I change the args and re-run, it goes back to the hard
>>drive and takes 30-50 seconds.
>How much is cached depends on shared_buffers, I believe. If the
>result is still cached, that'd explain why running the query with
>the same arguments returns so quickly. You might see some
>improvement using a prepared query, as the server shouldn't have to
>reparse and replan the query. Of course, if you change the
>arguments, it can't use the result that's cached from the previous
>run.
>
>Take this all with an appropriate amount of salt. I'm learning about
> this, too.
>
>Michael Glaesemann
>grzm seespotcode net
>
>
>
>---------------------------(end of
>broadcast)---------------------------
>TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do
>not
> match

 

SELECT p.party_id, p.first_name, p.last_name, pli.address1, pli.city, pli.state FROM customer as p JOIN address as pli ON ( p.party_id = pli.party_id ) WHERE ( p.void_flag IS NULL OR p.void_flag = false )  AND  (first_name like 'B%') AND (last_name like 'S%') AND (pli.state like 'M%') AND (pli.city like 'AL%') ORDER BY last_name, first_name LIMIT 51  

                                                                                                     QUERY PLAN                                       
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..96.48 rows=1 width=2450) (actual time=13459.814..13459.814 rows=0 loops=1)
   ->  Nested Loop  (cost=0.00..96.48 rows=1 width=2450) (actual time=13459.804..13459.804 rows=0 loops=1)
         ->  Index Scan using idx_last_name on customer p  (cost=0.00..50.22 rows=1 width=1209) (actual time=57.812..13048.524 rows=2474 loops=1)
               Index Cond: (((last_name)::text >= 'S'::character varying) AND ((last_name)::text < 'T'::character varying) AND ((first_name)::text >= 'B'::character varying) AND ((first_name)::text < 'C'::character varying))
               Filter: (((void_flag IS NULL) OR (void_flag = false)) AND ((first_name)::text ~~ 'B%'::text) AND ((last_name)::text ~~ 'S%'::text))
         ->  Index Scan using address_pkey on address pli  (cost=0.00..46.23 rows=1 width=1257) (actual time=0.149..0.149 rows=0 loops=2474)
               Index Cond: (("outer".party_id = pli.party_id))
               Filter: (((state)::text ~~ 'M%'::text) AND ((city)::text ~~ 'AL%'::text))
 Total runtime: 13460.292 ms



Picture this – share your photos and you could win big!



Get a preview of Live Earth, the hottest event this summer - only on MSN