Re: ts_headline

From: Richard Huxton <dev(at)archonet(dot)com>
To: Stephen Davies <scldad(at)sdc(dot)com(dot)au>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: ts_headline
Date: 2008-02-22 09:30:18
Message-ID: 47BE962A.8050403@archonet.com
Lists: pgsql-general pgsql-patches

Stephen Davies wrote:
> Not quite:-(
>
> It is the ts_headline with the explicit "english" configuration that "fails"
> rather than the implicit "simple".

Hmm... arse.

> That's what is so weird.
>
> As you say, the ts_vector has "databas" so the "english" version of
> ts_headline should work - but it doesn't. The "simple" version does; despite
> the above.

[goes away, tests some more]

OK, so:

set default_text_search_config = 'simple';
SELECT ts_headline('my database is a database', to_tsquery('database'));
SELECT ts_headline('my database is a database', to_tsquery('simple', 'database'));
SELECT ts_headline('my database is a database', to_tsquery('english', 'database'));

The first two work, the last one doesn't.

set default_text_search_config = 'english';
SELECT ts_headline('my database is a database', to_tsquery('database'));
SELECT ts_headline('my database is a database', to_tsquery('simple', 'database'));
SELECT ts_headline('my database is a database', to_tsquery('english', 'database'));

The middle one doesn't work.

Note that there are no indexes involved here, we're just running against
the raw text.

[light goes on over sluggish London-based database chap]

When the ts_headline function is working on the text, it needs to
convert it from varchar/text type to tsvector so that it can use the
tsquery to find words to highlight.

When it converts the text to a tsvector, it's doing it based on
default_text_search_config - we've not told it otherwise. In an ideal
world, it would look "inside" the tsquery and see what config that was
using, but it can't (or at least doesn't).

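Of course, if to_tsquery()'s config doesn't match ts_headline()'s then
we get a problem: one side has the stemmed "databas", the other the
unstemmed "database", and nothing gets highlighted.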
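
A minimal sketch of the mismatch, using the same sentence as above. It
uses the @@ match operator against an explicit tsvector as a stand-in
for what ts_headline does internally; that is an assumption made for
illustration, not a description of the actual implementation:

```sql
-- 'english' stems the word, 'simple' leaves it alone
SELECT to_tsvector('english', 'my database is a database');  -- contains the lexeme 'databas'
SELECT to_tsquery('english', 'database');                    -- 'databas'
SELECT to_tsquery('simple', 'database');                     -- 'database'

-- Same config on both sides: the lexemes agree
SELECT to_tsvector('english', 'my database is a database')
       @@ to_tsquery('english', 'database') AS same_config;

-- Mixed configs: 'databas' in the document vs 'database' in the query
SELECT to_tsvector('english', 'my database is a database')
       @@ to_tsquery('simple', 'database') AS mixed_config;
```

same_config should come back true and mixed_config false, which is the
same pattern as the ts_headline results above.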

And, if I actually bother to read an up-to-date copy of the manual,
rather than the beta version I've got linked on my desktop, I can see
there's a config parameter for ts_headline. So...

set default_text_search_config = 'simple';
SELECT ts_headline('english', 'my database is a database',
                   to_tsquery('english', 'database'));

set default_text_search_config = 'english';
SELECT ts_headline('simple', 'my database is a database',
                   to_tsquery('simple', 'database'));

Both of these work fine, whatever the default config is set to. Phew!
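
For a query against real data, the same idea might look something like
this. The docs table and its columns are made up for the example; the
point is only that the config is named explicitly in every call, so
nothing depends on default_text_search_config:

```sql
-- Hypothetical table, for illustration only
CREATE TEMP TABLE docs (id serial PRIMARY KEY, body text);
INSERT INTO docs (body) VALUES ('my database is a database');

-- Build the tsquery once with an explicit config, then use the same
-- config for both the match and the headline
SELECT id, ts_headline('english', body, q) AS headline
FROM docs, to_tsquery('english', 'database') AS q
WHERE to_tsvector('english', body) @@ q;
```

With the default headline options the matched words should come back
wrapped in <b>...</b>.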

--
Richard Huxton
Archonet Ltd
