Skip site navigation (1) Skip section navigation (2)

Re: sequential scan performance

From: Steve Atkins <steve(at)blighty(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: sequential scan performance
Date: 2005-05-30 16:53:40
Message-ID: 20050530165340.GA21210@gp.word-to-the-wise.com (view raw or flat)
Thread:
Lists: pgsql-performance
On Sun, May 29, 2005 at 08:27:26AM -0500, Michael Engelhart wrote:
> Hi -
> 
> I have a table of about 3 million rows of city "aliases" that I need  
> to query using LIKE - for example:
> 
> select * from city_alias where city_name like '%FRANCISCO'
> 
> 
> When I do an EXPLAIN ANALYZE on the above query, the result is:
> 
>  Seq Scan on city_alias  (cost=0.00..59282.31 rows=2 width=42)  
> (actual time=73.369..3330.281 rows=407 loops=1)
>    Filter: ((name)::text ~~ '%FRANCISCO'::text)
> Total runtime: 3330.524 ms
> (3 rows)
> 
> 
> this is a query that our system needs to do a LOT.   Is there any way  
> to improve the performance on this either with changes to our query  
> or by configuring the database deployment?   We have an index on  
> city_name but when using the % operator on the front of the query  
> string postgresql can't use the index .

If that's really what you're doing (the wildcard is always at the beginning)
then something like this

  create index city_name_idx on foo (reverse(city_name));

  select * from city_alias where reverse(city_name) like reverse('%FRANCISCO');

should do just what you need.

I use this, with a plpgsql implementation of reverse, and it works nicely.

CREATE OR REPLACE FUNCTION reverse(text) RETURNS text AS '
DECLARE
       original alias for $1;
       reverse_str text;
       i int4;
BEGIN
 reverse_str = '''';
 FOR i IN REVERSE LENGTH(original)..1 LOOP
  reverse_str = reverse_str || substr(original,i,1);
 END LOOP;
 return reverse_str;
END;'
LANGUAGE 'plpgsql' IMMUTABLE;


Someone will no doubt suggest using tsearch2, and you might want to
take a look at it if you actually need full-text search, but my
experience has been that it's too slow to be useful in production, and
it's not needed for the simple "leading wildcard" case.

Cheers,
  Steve

In response to

pgsql-performance by date

Next:From: Steinar H. GundersonDate: 2005-05-30 16:54:47
Subject: Re: Postgresql and xeon.
Previous:From: Sebastian BöckDate: 2005-05-30 16:36:39
Subject: Re: Index not used on join with inherited tables

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group