Re: Searching a gin index

From: James Dooley <jamdooley(at)gmail(dot)com>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Searching a gin index
Date: 2009-02-06 14:45:21
Message-ID: c01aff630902060645x10007796wc88730e8a8476bc6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Oleg, but I am only interested in whether or not the syntax of my
search-query is correct.

Having created the index as I mentioned above, would the correct way of
searching and using that index be

... AND (title || '' || description || '' || name) @@ plainto_tsquery('car')

or should it be as Richard just mentioned

... AND to_tsvector(title || '' || description || '' || name) @@
plainto_tsquery('car')

or some other way ?

On Fri, Feb 6, 2009 at 3:30 PM, Richard Huxton <dev(at)archonet(dot)com> wrote:

> James Dooley wrote:
> > Hi again,
> >
> > I have set my configuration as default and I have created a GIN index on
> > three columns, like this
> >
> > create index textsearch_index on products using gin(strip( to_tsvector(
> > 'my_config', title || '' || description || '' || name)))
> >
> > Searching these columns the way I have
> >
> > ... AND (title || '' || description || '' || name) @@
> plainto_tsquery('car')
> >
> > seems not to be correct since it's taking as much time as non-indexed.
>
> PG's planner isn't smart enough to transform a complex expression so as
> to use a functional index (which is what you've got). You need to
> mention the function explicitly.
>
> So, if you had:
> CREATE INDEX lowercase_idx ON mytable ( lower(mycolumn) )
> You then need to search against "lower(mycolumn)" and not just expect
> the planner to notice that mycolumn="abc" could use the index.
>
> The following should work for you as a starting point:
>
> CREATE TABLE tsearch_tbl (id SERIAL, title text, body text,
> PRIMARY KEY (id));
>
> INSERT INTO tsearch_tbl (title, body)
> SELECT 'title number ' || n, 'This is body number ' || n
> FROM generate_series(1,9999) n;
>
> ANALYSE tsearch_tbl;
>
> CREATE INDEX tsearch_tbl_words_idx ON tsearch_tbl USING gin (
> to_tsvector('english', title || body) );
>
> EXPLAIN ANALYSE SELECT * FROM tsearch_tbl WHERE to_tsvector('english',
> title || body) @@ to_tsquery('17');
> QUERY PLAN
>
>
> -------------------------------------------------------------------------------------------------------------------------------
> Bitmap Heap Scan on tsearch_tbl (cost=4.34..34.76 rows=10 width=45)
> (actual time=0.067..0.067 rows=1 loops=1)
> Recheck Cond: (to_tsvector('english'::regconfig, (title || body)) @@
> to_tsquery('17'::text))
> -> Bitmap Index Scan on tsearch_tbl_words_idx (cost=0.00..4.34
> rows=10 width=0) (actual time=0.059..0.059 rows=1 loops=1)
> Index Cond: (to_tsvector('english'::regconfig, (title || body))
> @@ to_tsquery('17'::text))
> Total runtime: 0.121 ms
>
>
> Note that you'll have problems if any of your text-fields contain nulls
> (since null || anything = null).
>
> Personally, unless I'm dealing with a large table, I like to have a
> separate tsvector column which I keep up to date with triggers. It makes
> it easier to debug problems.
>
> --
> Richard Huxton
> Archonet Ltd
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Richard Huxton 2009-02-06 15:01:44 Re: Searching a gin index
Previous Message justin 2009-02-06 14:30:44 Re: How do I turn on query logger?