Using indexes for like foo% type queries when foo isn't constant (not a locale issue)

From: Greg Stark <gsstark(at)mit(dot)edu>
To: pgsql-performance(at)postgresql(dot)org
Subject: Using indexes for like foo% type queries when foo isn't constant (not a locale issue)
Date: 2003-04-15 19:04:04
Message-ID: 87of37oa57.fsf@stark.dyndns.tv
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


In the following query postgres doesn't use the index. In the hard-coded
version below it does. I suppose it can't because it's possible the "target"
could have wildcards etc in them. Is there any way to indicate the postgres
that that won't happen?

This is going to be even more of an issue when preparsed queries happen
because even in the hard coded example it will be an issue. I know in Oracle
if you parse a query with a LIKE :1||'%' type expression it still plans to use
the index and that's extremely useful. I don't know what it does if there's a
% in the parameter, it either takes the performance hit or it doesn't treat
them as special?

db=> explain analyze select postalcode, abs(substr(target,6,1)::integer-substr(postalcode,6,1)::integer) as dist from postalcodes, (select 'L6C2M6'::text as target) as t where postalcode like substr(target,1,5)||'%' order by dist asc limit 2;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=12182.77..12182.77 rows=2 width=42) (actual time=9226.17..9226.18 rows=2 loops=1)
-> Sort (cost=12182.77..12186.16 rows=1359 width=42) (actual time=9226.16..9226.16 rows=2 loops=1)
Sort Key: abs(((substr(t.target, 6, 1))::integer - (substr((postalcodes.postalcode)::text, 6, 1))::integer))
-> Nested Loop (cost=0.00..12112.04 rows=1359 width=42) (actual time=3262.89..9205.25 rows=8 loops=1)
Join Filter: ("inner".postalcode ~~ (substr("outer".target, 1, 5) || '%'::text))
-> Subquery Scan t (cost=0.00..0.01 rows=1 width=0) (actual time=0.04..0.05 rows=1 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.02..0.02 rows=1 loops=1)
-> Seq Scan on postalcodes (cost=0.00..7335.69 rows=271769 width=10) (actual time=5.52..3268.74 rows=271769 loops=1)
Total runtime: 9241.92 msec
(9 rows)

db=> explain analyze select postalcode, abs(substr('L6C2M6',6,1)::integer-substr(postalcode,6,1)::integer) as dist from postalcodes where postalcode like substr('L6C2M6',1,5)||'%' order by dist asc limit 2;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=3.29..3.29 rows=1 width=10) (actual time=36.54..36.55 rows=2 loops=1)
-> Sort (cost=3.29..3.29 rows=1 width=10) (actual time=36.53..36.54 rows=2 loops=1)
Sort Key: abs((6 - (substr((postalcode)::text, 6, 1))::integer))
-> Index Scan using idx_postalcodes_postalcodeon on postalcodes (cost=0.00..3.28 rows=1 width=10) (actual time=35.91..36.33 rows=8 loops=1)
Index Cond: ((postalcode >= 'L6C2M'::bpchar) AND (postalcode < 'L6C2N'::bpchar))
Filter: (postalcode ~~ 'L6C2M%'::text)
Total runtime: 36.93 msec
(7 rows)

--
greg

Browse pgsql-performance by date

  From Date Subject
Next Message Andreas Pflug 2003-04-15 19:55:20 Re: Do Views offer any performance advantage?
Previous Message Tom Lane 2003-04-15 17:26:23 Re: Do Views offer any performance advantage?