Re: default_text_search_config and expression indexes

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paesold <mpaesold(at)gmx(dot)at>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: default_text_search_config and expression indexes
Date: 2007-08-01 04:31:03
Message-ID: Pine.LNX.4.64.0708010756270.18739@sn.sai.msu.ru
Lists: pgsql-advocacy pgsql-hackers

On Tue, 31 Jul 2007, Bruce Momjian wrote:

> Oleg Bartunov wrote:
>> On Tue, 31 Jul 2007, Bruce Momjian wrote:
>>
>>>>> And if we have to require the configuration name in CREATE INDEX, it has
>>>>> to be used in WHERE, so we might as well just remove the default
>>>>> capability and always require the configuration name.
>>>>
>>>> this is a very rare use case for text searching:
>>>> 1. expression index without configuration name
>>>> 2. default_text_search_config can be changed by somebody
>>>
>>> If you are going to be using the configuration name with the create
>>> expression index, you have to use it in the WHERE clause (or the index
>>> doesn't work), and I assume that is 90% of the text search uses. I
>>> don't see it as rare at all.
>>
>> What is the basis of your assumption? In my opinion, it's a very limited
>> use of text search, because it doesn't support ranking. In 4-5 years
>> of tsearch2 usage I have never used it and have never seen it on the
>> mailing lists. This is a very user-oriented feature and we could
>> probably ask -general people for their opinion.
>
> I doubt 'general' is going to understand the details of merging this
> into the backend. I assume we have enough people on hackers to decide
> this.

I don't mean the technical details, but the use case. Do they need an
expression index without ranking, at the cost of losing the ability to use
the default configuration in other cases too? My prediction is that people
never thought about this possibility until we told them about it.
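
For readers following the thread, the case under discussion can be sketched
roughly as below. This is only an illustration, not taken from the patch: the
table docs, its columns, and the index name are hypothetical, and the
'english' configuration is assumed to exist.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE docs (
    id    serial PRIMARY KEY,
    title text,
    body  text
);

-- Expression index that names the configuration explicitly.
CREATE INDEX docs_body_fts_idx
    ON docs USING gin (to_tsvector('english', body));

-- The WHERE clause has to repeat the indexed expression, including the
-- configuration name, for the planner to consider the index.
SELECT id
FROM docs
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'search & index');

-- This form relies on default_text_search_config instead. Its expression
-- differs from the indexed one, so the index above is not matched.
SELECT id
FROM docs
WHERE to_tsvector(body) @@ to_tsquery('search & index');
```

Note that this query has no stored tsvector to rank against, which is the
limitation referred to above.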

>
> Are you saying the majority of users have a separate column with a
> trigger? Does the trigger specify the configuration? I don't see that
> as a parameter argument to tsvector_update_trigger(). If you reload a
> pg_dump, what does it use for the configuration?
>

Yes, a separate column with a custom trigger works fine. It's up to you how
to keep your data current, and it's up to you how to write the trigger.
Our tsvector_update_trigger() is really a tsvector_update_trigger_example()!

> Why is a separate column better than the index? Just ranking?

Ranking + composite documents. I already mentioned that this could be
rather expensive. Also, having a separate column allows people various
ways to define what a document is, and even to change it.
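
A minimal sketch of the separate-column approach, assuming a hypothetical
table docs(id, title, body) and the 'english' configuration; the function,
trigger, and index names are made up for illustration, and a real trigger
would be written to suit the application:

```sql
-- Store the document's tsvector in its own column.
ALTER TABLE docs ADD COLUMN doc_tsv tsvector;

-- Custom trigger function: the "document" is composed from two columns,
-- each with its own weight, and the configuration is named explicitly.
CREATE FUNCTION docs_tsv_update() RETURNS trigger AS $$
BEGIN
    NEW.doc_tsv :=
        setweight(to_tsvector('english', coalesce(NEW.title, '')), 'A') ||
        setweight(to_tsvector('english', coalesce(NEW.body, '')), 'B');
    RETURN NEW;
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER docs_tsv_trg
    BEFORE INSERT OR UPDATE ON docs
    FOR EACH ROW EXECUTE PROCEDURE docs_tsv_update();

-- Plain index on the stored column; no expression to repeat in queries.
CREATE INDEX docs_tsv_idx ON docs USING gin (doc_tsv);

-- Ranking reads the stored tsvector instead of re-parsing the text.
SELECT id, ts_rank(doc_tsv, q) AS rank
FROM docs, to_tsquery('english', 'search & index') AS q
WHERE doc_tsv @@ q
ORDER BY rank DESC;
```

Changing what counts as a document then means changing the trigger function
and refreshing the column, not the queries.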

>
> The reason the expression index is nice is this feature has to be easy
> to use for people who are new to full text and even PostgreSQL. Right
> now /contrib is fine for experts to use, but we want a larger user base
> for this feature.

I agree here. This was one of the main reasons for our work for 8.3.
Probably we should think in another direction: not to curtail tsearch2
and confuse the rather large existing user base, but to add the ability to
somehow save the configuration used to create the *document*,
either implicitly (in an expression index, or just gin(text_column)) or
explicitly (separate column). There is no problem with the index itself!

>
>>
>> I'd rather say we don't support text searching using an expression index
>> than remove default_text_search_config. Anyway, I don't feel
>> responsible for such an important problem. We need more feedback from
>> users.
>
> Well, I am waiting for other hackers to get involved, but if they don't,
> I have to evaluate it myself on the email lists.
>
>>> If we are going to keep it, I need someone to explain why my comments
>>> above are wrong. If I am right, someone has to remove
>>> default_text_search_config from the patch. I can do the documentation.
>>
>> I'm at a conference and then will be busy writing my applications and
>> earning money; Teodor is on vacation. I don't want to draw a
>> hasty conclusion, since we're very tired of changing our patch from
>> one solution to another. We need a consensus of developers and users.
>> I'm almost exhausted and have no time to continue this discussion.
>>
>> Would you be so kind as to write a separate post about this problem and
>> ask -hackers and -general for feedback? Let experienced users
>> show their needs. We have said everything and have nothing to add.
>
> If you have no time to continue discussion and perhaps update the patch,
> we can consider this patch dead for 8.3 and we can hold it for 8.4
> because I can guarantee you this is going to need more discussion and
> patch modification before it gets into CVS.
>
> This patch is being treated fairly and exactly the same as every other
> patch.

Why do you say this? I didn't complain about this.

>
> Should we hold the patch for 8.4?

If we can't agree to say in the docs that implicit usage of a text search
configuration in the CREATE INDEX command isn't supported, could we at
least leave default_text_search_config for superusers?

Anyway, let's wait and see what other people say.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
