Re: to_tsvector() with hyphens

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Brian DeRocher <brian(at)derocher(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: to_tsvector() with hyphens
Date: 2015-07-06 16:36:02
Message-ID: 29462.1436200562@sss.pgh.pa.us
Lists: pgsql-general

Brian DeRocher <brian(at)derocher(dot)org> writes:
> But why does to_tsquery() AND them?

> rasmas_hackathon=> select * from to_tsquery( 'gn-foo | bandage' );
> to_tsquery
> ------------------------------------
> 'gn-foo' & 'gn' & 'foo' | 'bandag'
> (1 row)

Because what you're looking for is gn-foo, not gn alone or foo alone.
Converting to "OR" would be the wrong thing.
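
One way to inspect this behavior, as a sketch that assumes the `english` text search configuration (the thread relies on the session default, which evidently stems `bandage` to `bandag`). `ts_debug` shows what the parser emits for a hyphenated word, and the two `@@` checks show which query shapes the sample document can satisfy. Expected output is left out because the exact rendering of a hyphenated query differs between PostgreSQL versions.

```sql
-- Show how the parser splits a hyphenated word. It should report the
-- whole word as one token and each part as a separate token.
SELECT alias, token, lexemes
FROM ts_debug('english', 'gn-foo');

-- The sample document contains 'gn' but neither 'gn-foo' nor 'foo',
-- so the hyphenated term on its own should not match.
SELECT to_tsvector('english', 'gn series bandage')
       @@ to_tsquery('english', 'gn-foo') AS hyphenated_only;

-- With the OR branch added, the document can still match via 'bandage'.
SELECT to_tsvector('english', 'gn series bandage')
       @@ to_tsquery('english', 'gn-foo | bandage') AS hyphenated_or_bandage;
```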

> The rank is so bad.

> rasmas_hackathon=> select ts_rank_cd( to_tsvector( 'gn series bandage' ), to_tsquery( 'gn-foo | bandage' ) );
> ts_rank_cd
> ------------
> 0.1
> (1 row)

> Without the hyphen the rank is better, despite the process above.

> rasmas_hackathon=> select ts_rank_cd( to_tsvector( 'gn series bandage' ), to_tsquery( 'gn | bandage' ) );
> ts_rank_cd
> ------------
> 0.2
> (1 row)

Don't see the problem. In the first case the document doesn't match the
query as well as it does in the second, so I'd fully expect a higher rank
for the second.
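
To compare the two cases side by side, something like the following could be used. It is a sketch assuming the `english` configuration; the column aliases `query_text`, `parsed_query`, and `rank` are illustrative names, not anything from the thread. The thread reports ranks of 0.1 and 0.2 for these two queries under the poster's session defaults.

```sql
-- Rank the same document against both query strings and show how each
-- string was parsed, so the difference in match quality is visible.
SELECT q.query_text,
       to_tsquery('english', q.query_text) AS parsed_query,
       ts_rank_cd(to_tsvector('english', 'gn series bandage'),
                  to_tsquery('english', q.query_text)) AS rank
FROM (VALUES ('gn-foo | bandage'),
             ('gn | bandage')) AS q(query_text);
```

In the first row only the `bandage` branch can contribute to the rank, since the hyphenated branch requires lexemes the document does not contain; in the second row both `gn` and `bandage` are present.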

regards, tom lane
