Re: Text Search zero padding

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Richard Greenwood" <richard(dot)greenwood(at)gmail(dot)com>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Text Search zero padding
Date:	2008-02-29 05:19:05
Message-ID:	22411.1204262345@sss.pgh.pa.us
Lists:	pgsql-general

"Richard Greenwood" <richard(dot)greenwood(at)gmail(dot)com> writes:
> I am using text search across multiple columns. Two of the columns
> have values that have zero padding - sort of. The values look like
> R0001234 (1 char followed by 7 digits, zero padded). Users are
> accustomed to searching with and without the zero padding (entering
> R0001234 or R1234 should return identical results). This is easy to
> accommodate when parsing user input for a single column, but text
> searching across multiple columns it is harder to determine if a
> char/digit group should be padded.

> So far my best idea is to create a tsvector column containing both
> padded and non-padded versions of the value. i.e. put both R1234 and
> R0001234 into the tsvector column. This seems pretty brute force, and
> I am pretty new to text search, so I'd welcome any suggestions.

I'm not an expert in tsearch either, but given what you say here,
it seems like the Right Thing is to create a parser or dictionary
that strips those zeroes as being insignificant, so that R0001234 and
R1234 get mapped to the same stored/searchable lexeme.
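
To make the mapping concrete, here is a minimal sketch in Python of the
normalization such a parser or dictionary would perform. It is not a
tsearch dictionary and does not use any PostgreSQL API; the function name
normalize_token, the pattern (one letter followed by one to seven digits),
and the choice to lowercase the result are all assumptions made for
illustration.

```python
import re

# Assumed token shape: one ASCII letter followed by 1 to 7 ASCII digits,
# e.g. "R0001234" or "R1234". Adjust to the real data as needed.
_ID_PATTERN = re.compile(r"([A-Za-z])([0-9]{1,7})")


def normalize_token(token: str) -> str:
    """Map padded and unpadded forms of an identifier to one lexeme.

    Tokens that do not look like an identifier are only lowercased.
    """
    match = _ID_PATTERN.fullmatch(token)
    if match is None:
        return token.lower()
    prefix, digits = match.groups()
    # Leading zeroes are treated as insignificant; keep a single "0"
    # if the numeric part consists of zeroes only.
    stripped = digits.lstrip("0") or "0"
    return prefix.lower() + stripped


if __name__ == "__main__":
    for raw in ("R0001234", "R1234", "r01234", "hello"):
        print(raw, "->", normalize_token(raw))
```

The point is that the same normalization has to be applied both to the
stored values and to the user's query terms, so that either spelling
produces the same lexeme and no second, padded copy needs to be stored.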

regards, tom lane

In response to

Text Search zero padding at 2008-02-29 03:40:34 from Richard Greenwood
