Skip site navigation (1) Skip section navigation (2)

query plan with index having a btrim is different for strings of different length

From: Richard Yen <dba(at)richyen(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: query plan with index having a btrim is different for strings of different length
Date: 2008-12-09 19:56:45
Message-ID: 751F49DC-1142-4E9C-95C8-BC3C229B3F0B@richyen.com (view raw or flat)
Thread:
Lists: pgsql-performance
Hi,

I've discovered a peculiarity with using btrim in an index and was  
wondering if anyone has any input.

My table is like this:
                      Table "public.m_object_paper"
        Column        |          Type          |       Modifiers
---------------------+------------------------+------------------------
  id                  | integer                | not null
  title               | character varying(200) | not null
  x_firstname         | character varying(50)  |
  x_lastname          | character varying(50)  |
<...snip...>
  page_count          | smallint               |
  compare_to_database | bit varying            | not null
Indexes:
     "m_object_paper_pkey" PRIMARY KEY, btree (id)
     "last_name_fnc_idx" btree (lower(btrim(x_lastname::text)))
     "m_object_paper_assignment_idx" btree (assignment)
     "m_object_paper_owner_idx" btree (owner) CLUSTER
<...snip to end...>

My query is like this:
SELECT m_object_paper.id FROM m_object_paper, m_assignment WHERE  
m_object_paper.assignment = m_assignment.id AND  
m_object_paper.owner=-1 AND m_assignment.class = 2450798 AND  
lower(btrim(x_firstname)) = lower(btrim($FIRSTNAME)) and  
lower(btrim(x_lastname)) = lower(btrim($LASTNAME));

Strangely, if $LASTNAME is 5 chars, the query plan looks like this:
tii=# explain SELECT m_object_paper.id FROM m_object_paper,  
m_assignment WHERE m_object_paper.assignment = m_assignment.id AND  
m_object_paper.owner=-1 AND m_assignment.class = 2450798 AND  
lower(btrim(x_firstname)) = lower(btrim('Jordan')) and  
lower(btrim(x_lastname)) = lower(btrim('Smith'));
                                                   QUERY PLAN
---------------------------------------------------------------------------------------------------------------
  Hash Join  (cost=181704.85..291551.77 rows=1 width=4)
    Hash Cond: (m_object_paper.assignment = m_assignment.id)
    ->  Bitmap Heap Scan on m_object_paper  (cost=181524.86..291369.66  
rows=562 width=8)
          Recheck Cond: ((lower(btrim((x_lastname)::text)) =  
'smith'::text) AND (owner = (-1)))
          Filter: (lower(btrim((x_firstname)::text)) = 'jordan'::text)
          ->  BitmapAnd  (cost=181524.86..181524.86 rows=112429 width=0)
                ->  Bitmap Index Scan on last_name_fnc_idx   
(cost=0.00..5468.29 rows=496740 width=0)
                      Index Cond: (lower(btrim((x_lastname)::text)) =  
'smith'::text)
                ->  Bitmap Index Scan on m_object_paper_owner_idx   
(cost=0.00..176056.04 rows=16061244 width=0)
                      Index Cond: (owner = (-1))
    ->  Hash  (cost=177.82..177.82 rows=174 width=4)
          ->  Index Scan using m_assignment_class_idx on m_assignment   
(cost=0.00..177.82 rows=174 width=4)
                Index Cond: (class = 2450798)
(13 rows)

However, if $LASTNAME is != 5 chars (1 char, 100 chars, doesn't make a  
difference), the query plan looks like this:
tii=# explain SELECT m_object_paper.id FROM m_object_paper,  
m_assignment WHERE m_object_paper.assignment = m_assignment.id AND  
m_object_paper.owner=-1 AND m_assignment.class = 2450798 AND  
lower(btrim(x_firstname)) = lower(btrim('Jordan')) and  
lower(btrim(x_lastname)) = lower(btrim('Smithers'));
                                             QUERY PLAN
---------------------------------------------------------------------------------------------------
  Nested Loop  (cost=0.00..10141.06 rows=1 width=4)
    ->  Index Scan using last_name_fnc_idx on m_object_paper   
(cost=0.00..10114.24 rows=11 width=8)
          Index Cond: (lower(btrim((x_lastname)::text)) =  
'smithers'::text)
          Filter: ((owner = (-1)) AND  
(lower(btrim((x_firstname)::text)) = 'jordan'::text))
    ->  Index Scan using m_assignment_pkey on m_assignment   
(cost=0.00..2.43 rows=1 width=4)
          Index Cond: (m_assignment.id = m_object_paper.assignment)
          Filter: (m_assignment.class = 2450798)
(7 rows)

In practice, the difference is 300+ seconds when $LASTNAME == 5 chars  
and <1 second when $LASTNAME != 5 chars.

Would anyone know what's going on here?  Is there something about the  
way btrim works, or perhaps with the way indexes are created?  It's  
strange that the query plan would change for just one case ("Jones,"  
"Smith," "Brown," etc., all cause the query plan to use that extra  
heap scan).

Thanks for any help!
--Richard

Responses

pgsql-performance by date

Next:From: Merlin MoncureDate: 2008-12-09 20:07:04
Subject: Re: Need help with 8.4 Performance Testing
Previous:From: Joshua D. DrakeDate: 2008-12-09 18:11:25
Subject: Re: Need help with 8.4 Performance Testing

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group