Re: tsearch thoughts

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)stack(dot)net>
Subject: Re: tsearch thoughts
Date: 2002-12-01 10:47:47
Message-ID: Pine.GSO.4.44.0212011338120.2857-100000@ra.sai.msu.su
Lists: pgsql-hackers

On Sat, 30 Nov 2002, Christopher Kings-Lynne wrote:

> Is there any reason why the tsearch indexes couldn't be modified to just work
> on TEXT fields and not TXTIDX fields? Is there really a reason to have the
> TXTIDX type?
>
> I mean, when the index is created over the text column, instead of just
> indexing the text as-is, index the txt2txtidx'd version...?
>
> That would vastly reduce the complexity of tsearch, and would make the
> indexed text invisible, as it is in most other fti implementations...?

Chris,

This is more or less how we have thought about full text searching in
postgres and what should happen as tsearch matures. We began with a contrib
module just to get some experience, and we still need to do some research on
the underlying algorithms. Also, remember that the current GiST is not
concurrent; we plan to work on this issue. We're very busy and need somebody
to help us with the interface (dictionaries, parser, postgresql internal
interface).

>
> I tried to simulate this myself, although ideally it would be invisible to
> the user:
>
> test=# create table test (a text);
> CREATE
> test=# CREATE INDEX my_idx ON test USING gist(txt2txtidx(a));
> ERROR: DefineIndex: index function must be marked iscachable
>
> So the index function isn't iscachable - why's that?

I don't remember the reason, but you may try to define it as 'iscachable'
in tsearch.sql.
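
A rough sketch of what that change might look like, assuming the function
is declared in tsearch.sql with the old-style WITH (...) attribute list.
The existing declaration in your copy may differ; the only point here is
the added 'iscachable' attribute. Note that this is a promise that the
result depends only on the input text, which may not hold if the
dictionaries or the locale change, so treat it as an experiment.

```sql
-- Hypothetical edit to the txt2txtidx declaration in tsearch.sql.
-- MODULE_PATHNAME is the placeholder that the contrib build replaces
-- with the path to the shared library.
CREATE FUNCTION txt2txtidx(text)
RETURNS txtidx
AS 'MODULE_PATHNAME'
LANGUAGE 'C' WITH (isstrict, iscachable);

-- After reloading the module, the functional index from the quoted
-- example should no longer be rejected for this reason:
CREATE INDEX my_idx ON test USING gist(txt2txtidx(a));
```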

>
> Say it was marked iscachable, then I'd be able to query like this:
>
> SELECT * FROM test WHERE txt2txtidx(a) ## 'apple';
>
> This would mean that the index on-disk file would be large, but the table
> file would stay small. It would also vastly reduce the size of pg_dumps...
>
> Could we move towards something like:
>
> CREATE FULLTEXT INDEX my_idx ON test (a);
>
> Or something?
>
> Chris

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
