Re: How to search ignoring spaces and minus signs

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Andrus <kobruleht2(at)hot(dot)ee>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: How to search ignoring spaces and minus signs
Date: 2010-10-14 03:53:18
Message-ID: 4CB67EAE.40408@postnewspapers.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 14/10/10 01:45, Andrus wrote:
> CHAR(20) columns in 8.4 database may contains spaces and - signs like
>
> 13-333-333
> 12 3-44
> 33 33 333
> 12345
>
> User enters code to search without spaces and - signs, like 12344
> How to search for product code ignoring spaces and - signs?
>
> For example searching for code 12344 should return
> 12 3-44 as matching item.

Either convert the user input into a suitable regular expression, or
pre-process the column being searched to strip spaces and - signs.

I'd recommend writing a simple SQL function that uses a regexp_replace()
or a couple of regular replace() calls to simplify the column being
searched down to only numbers. If you want to strip *everything* that's
not a number, you could use:

CREATE OR REPLACE FUNCTION strip_nondigits(text) RETURNS text AS $$
SELECT regexp_replace($1, E'[^0-9]', '', 'g');
$$ LANGUAGE 'sql' IMMUTABLE STRICT;

See: http://www.postgresql.org/docs/current/static/functions-string.html

http://www.postgresql.org/docs/current/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP

http://www.postgresql.org/docs/current/interactive/xfunc-sql.html

You can now create a functional index on your target column that uses
that function, allowing you to run indexed searches against the column
without repeatedly re-evaluating the stripping expression for every
value during every search. The query planner will recognise when you use
the indexed expression in a query, and will use the index where
appropriate. Make the index like this:

CREATE INDEX stripped_numbers ON mytable ((strip_nondigits(thecolumn));

then use it like this:

SELECT * FROM mytable WHERE strip_nondigits('123-user-input-here') =
strip_nondigits(thecolumn);

EXPLAIN should show that an index scan of stripped_numbers is being
used, at least if the table contains non-trivial amounts of data.

Alternately, you could use a trigger to maintain a stripped version of
the field as an additional column in each row. Both approaches cost you
a bit of time during updates/inserts though, and the functional index is
probably easier.

--
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Jayadevan M 2010-10-14 04:05:15 Re: How to search ignoring spaces and minus signs
Previous Message Craig Ringer 2010-10-14 03:41:55 Re: Copying data files to new hardware?