Re: tsearch2 in PostgreSQL 8.3?

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Subject: Re: tsearch2 in PostgreSQL 8.3?
Date: 2007-08-17 17:31:40
Message-ID: 200708171731.l7HHVeK24797@momjian.us
Lists: pgsql-hackers

Josh Berkus wrote:
> Folks,
>
> Here's something not to forget in this whole business: the present TSearch2
> implementation permits you to have a different tsvector configuration for
> each *row*, not just each column. That is, applications can be built with
> "per-cell" configs.
>
> I know of at least one out there: Ubuntu's Rosetta. I'm sure there are
> others.
>
> Therefore there are two cases we're trying to solve:
>
> (1) The simple case: someone wants to build a database with text search
> entirely in one UTF8 language. All vectors are in that language, and so are
> all queries. The user wants the simplest syntax possible.
>
> (2) The Rosetta case: different configs are used for each cell and all
> searches have to be language-qualified.
>
> In both cases, the databases need to backup and restore cleanly.
>
> From this, I'd first of all say that I don't see the point of a Superuser
> default_tsvector_search_config. There are too many failure conditions with
> the default once you get away from the simplest case, so I don't see how
> setting it to Superuser-only protects anything. Might as well make it a
> userset and then it will be more useful.

Per my email yesterday, default_tsvector_search_config is _not_
super-user-only:

    o  default_text_search_config stays, not super-user-only, not set
       in pg_dump output

> Unfortunately, the way I see it the only permanent solution for this is to
> alter the TSvector structure to include a config OID at the beginning of it.
> That doesn't sound like it's doable in time for 8.3, though; is there a way
> we could work around that until 8.4?

Oh, so you want the config inside each tsvector value. Interesting
idea.
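
To make the idea concrete, here is a toy sketch in Python of a vector value
that carries its own configuration. It is not how tsvector is stored in
PostgreSQL. The names TaggedVector, to_tagged_vector, matches, and the two
stand-in "configurations" are all hypothetical and exist only for
illustration. The point is that once the config travels with the value, a
query can be normalized the same way as the stored text without consulting
any session default.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for text search configurations: each one maps a
# word to a lexeme. Real configurations use parsers and dictionaries.
CONFIGS = {
    "english": lambda w: w.lower().rstrip("s"),  # crude plural stripping
    "simple": lambda w: w.lower(),               # lowercase only
}


@dataclass(frozen=True)
class TaggedVector:
    config: str          # name of the configuration that built this value
    lexemes: frozenset   # normalized lexemes from the source text


def to_tagged_vector(config: str, text: str) -> TaggedVector:
    """Build a vector and record which configuration produced it."""
    normalize = CONFIGS[config]
    return TaggedVector(config, frozenset(normalize(w) for w in text.split()))


def matches(vec: TaggedVector, word: str) -> bool:
    """Normalize the query word with the vector's own configuration."""
    return CONFIGS[vec.config](word) in vec.lexemes


# Two "rows" built from the same text under different configurations,
# as in the per-row (Rosetta) case.
rows = [
    to_tagged_vector("english", "Backup tapes"),
    to_tagged_vector("simple", "Backup tapes"),
]
```

With this layout, matches(rows[0], "tape") consults the "english" stand-in
and matches(rows[1], "tape") consults "simple", so each row is searched under
the configuration it was built with.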

> And why does this sound exactly like the issues we've had with per-column
> encodings and the currency type?

Yes, this is a very similar issue except we are trying to allow multiple
encodings.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
