Quick Links

Re: sequential scan performance

From:	Steve Atkins <steve(at)blighty(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: sequential scan performance
Date:	2005-05-30 16:53:40
Message-ID:	20050530165340.GA21210@gp.word-to-the-wise.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

On Sun, May 29, 2005 at 08:27:26AM -0500, Michael Engelhart wrote:
> Hi -
>
> I have a table of about 3 million rows of city "aliases" that I need
> to query using LIKE - for example:
>
> select * from city_alias where city_name like '%FRANCISCO'
>
>
> When I do an EXPLAIN ANALYZE on the above query, the result is:
>
> Seq Scan on city_alias (cost=0.00..59282.31 rows=2 width=42)
> (actual time=73.369..3330.281 rows=407 loops=1)
> Filter: ((name)::text ~~ '%FRANCISCO'::text)
> Total runtime: 3330.524 ms
> (3 rows)
>
>
> this is a query that our system needs to do a LOT. Is there any way
> to improve the performance on this either with changes to our query
> or by configuring the database deployment? We have an index on
> city_name but when using the % operator on the front of the query
> string postgresql can't use the index .

If that's really what you're doing (the wildcard is always at the beginning)
then something like this

create index city_name_idx on foo (reverse(city_name));

select * from city_alias where reverse(city_name) like reverse('%FRANCISCO');

should do just what you need.

I use this, with a plpgsql implementation of reverse, and it works nicely.

CREATE OR REPLACE FUNCTION reverse(text) RETURNS text AS '
DECLARE
original alias for $1;
reverse_str text;
i int4;
BEGIN
reverse_str = '''';
FOR i IN REVERSE LENGTH(original)..1 LOOP
reverse_str = reverse_str || substr(original,i,1);
END LOOP;
return reverse_str;
END;'
LANGUAGE 'plpgsql' IMMUTABLE;

Someone will no doubt suggest using tsearch2, and you might want to
take a look at it if you actually need full-text search, but my
experience has been that it's too slow to be useful in production, and
it's not needed for the simple "leading wildcard" case.

Cheers,
Steve

In response to

sequential scan performance at 2005-05-29 13:27:26 from Michael Engelhart

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Steinar H. Gunderson	2005-05-30 16:54:47	Re: Postgresql and xeon.
Previous Message	Sebastian Böck	2005-05-30 16:36:39	Re: Index not used on join with inherited tables