Re: ts_headline

From: Stephen Davies <scldad(at)sdc(dot)com(dot)au>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: ts_headline
Date: 2008-02-22 00:07:32
Message-ID: 200802221037.33055.scldad@sdc.com.au
Lists: pgsql-general pgsql-patches

OK. The first-level explanation is that my default config is "simple".
This explains the different query results, as "english" reduces "database" to
"databas" while "simple" does not reduce it at all.
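
A quick way to see the difference, as a minimal sketch (assuming a stock 8.3
install with the built-in "english" and "simple" configurations):

```sql
-- "english" stems the word, "simple" only lower-cases it.
SELECT to_tsquery('english', 'database');   -- 'databas'
SELECT to_tsquery('simple',  'database');   -- 'database'

-- A vector built with one config does not match a query built with the other.
SELECT to_tsvector('english', 'database') @@ to_tsquery('simple', 'database');   -- f
SELECT to_tsvector('english', 'database') @@ to_tsquery('english', 'database');  -- t
```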

The "document" is parsed/indexed using "english" explicitly, so my queries need
to be explicit also (not an issue, as all "real" queries are generated rather
than typed).
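
For illustration, either of the following should keep the query side in step
with the indexed side. This is only a sketch: the table and column names come
from the queries quoted below, it assumes "clob" holds a tsvector built with
"english", and changing the session default is an assumption about what the
application can tolerate.

```sql
-- Option 1: name the configuration explicitly in every generated query.
SELECT title, author
  FROM document
 WHERE clob @@ to_tsquery('english', 'database');

-- Option 2: change the default for the current session instead.
SET default_text_search_config = 'pg_catalog.english';
SELECT title, author
  FROM document
 WHERE clob @@ to_tsquery('database');
```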

However, I still cannot see a reason for the ts_headline results. If anything,
they should be the other way around.

I suspect that ts_headline may only work properly when no configuration is
specified - regardless of the default setting.
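
One way to test that suspicion, as a sketch: ts_headline() accepts an optional
configuration as its first argument, which is used to parse the document text.
The sample sentence is made up for illustration. If the cause is only that the
document and the query are processed with different configurations, the first
two calls should highlight "database" and the third should not.

```sql
-- Same configuration for parsing the document and for building the query.
SELECT ts_headline('english', 'The database and the thesaurus',
                   to_tsquery('english', 'database'));
SELECT ts_headline('simple', 'The database and the thesaurus',
                   to_tsquery('simple', 'database'));

-- Mixed: document parsed with "simple", query stemmed with "english".
SELECT ts_headline('simple', 'The database and the thesaurus',
                   to_tsquery('english', 'database'));
```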

Cheers,
Stephen

On Thursday 21 February 2008 22:30, Richard Huxton wrote:
> Stephen Davies wrote:
> > I just spotted the difference between your test and mine.
> >
> > My query says:
> >
> > select ts_headline(abstract,to_tsquery('english','database'),'minWords =
> > 99, maxWords = 999') from document where id=21;
> >
> > where your equivalent does not include the 'english' arg.
> >
> > If I take out the 'english' from this query, I get the same result as
> > you.
>
> What does this give you:
> show default_text_search_config;
> I get pg_catalog.english and the same result for the query whether I use:
> to_tsquery('english','database')
> or to_tsquery('pg_catalog.english','database')
>
> Could you be picking up a bad "english" configuration (see \dF)?
>
> > However, the following returns zero rows:
> >
> > select title,author,ts_headline(abstract,to_tsquery('database')) from
> > document where clob @@ to_tsquery('database')
>
> I take it "clob" matches "abstract"?
>
> > It gets more interesting:
> >
> > select title,author,ts_headline(abstract,to_tsquery('database')) from
> > document where clob @@ to_tsquery('english','database')
> >
> > returns the "correct" result - one row with the expected headline.
>
> Now that *is* strange. ts_headline() works without specifying 'english'
> but the actual search works the other way.
>
> > select
> > title,author,ts_headline(abstract,to_tsquery('english','thesaurus')) from
> > document where clob @@ to_tsquery('english','thesaurus')
> >
> > also returns the "correct" result.
> >
> > I suggest that the above indicates a bug somewhere.
>
> Could be - it'd be good to rule out a bad config. You might have an
> unexpected list of stopwords or similar.
>
> Let's try:
> SELECT ts_debug('the database and thesaurus');
> SELECT ts_debug('english', 'the database and thesaurus');
> SELECT ts_debug('pg_catalog.english', 'the database and thesaurus');
> I'd expect "the", "and" to be stripped out as stopwords and the other
> two to get through (database stemmed to "databas").

--
========================================================================
This email is for the person(s) identified above, and is confidential to
the sender and the person(s). No one else is authorised to use or
disseminate this email or its contents.

Stephen Davies Consulting Voice: 08-8177 1595
Adelaide, South Australia. Fax: 08-8177 0133
Computing & Network solutions. Mobile:0403 0405 83
