Re: [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Euler Taveira de Oliveira <euler(at)timbira(dot)com>, Edwin Groothuis <postgresql(at)mavetju(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit
Date: 2008-05-07 04:12:36
Message-ID: 200805070412.m474Ca606609@momjian.us
Lists: pgsql-bugs pgsql-patches


Added to TODO:

  o Consider changing error to warning for strings larger than one
    megabyte

http://archives.postgresql.org/pgsql-bugs/2008-02/msg00190.php
http://archives.postgresql.org/pgsql-patches/2008-03/msg00062.php

---------------------------------------------------------------------------

Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > Tom Lane wrote:
> >> I don't think that follows. A tsearch index is lossy anyway, so there's
>
> > Uh, the index is lossy but I thought it was lossy in a way that just
> > required additional heap accesses, not lossy in that it doesn't index
> > everything.
>
> Sure it's lossy. It doesn't index stopwords, and it doesn't index the
> difference between various forms of a word (when the dictionaries reduce
> them to a common root).
>
> > I am concerned a 1mb limit is too low though. Exactly why can't we have
> > a higher limit? Is positional information that significant?
>
> That's pretty much exactly the point: it's not very significant, and it
> doesn't justify a total inability to index large documents.
>
> One thing we could do is index words that are past the limit but not
> store a position, or perhaps have the convention that the maximum
> position value means "somewhere past here".
>
> regards, tom lane

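For illustration, Tom's second suggestion (treating the maximum position value as "somewhere past here") can be sketched as below. This is a minimal Python sketch, not the tsearch C code: build_positions and MAX_POSITION are hypothetical names, and the value 16383 is an assumption chosen to mirror the position cap documented for tsvector.

```python
from collections import defaultdict

# Assumed sentinel: a stored position equal to this value means
# "at or somewhere past this position" (hypothetical constant).
MAX_POSITION = 16383


def build_positions(words, max_position=MAX_POSITION):
    """Map each word to its list of positions, clamping at max_position.

    Words beyond the limit are still indexed; they just lose their
    exact position instead of causing an error.
    """
    index = defaultdict(list)
    for pos, word in enumerate(words, start=1):
        clamped = min(pos, max_position)
        # Record the sentinel at most once per word to avoid duplicates.
        if not index[word] or index[word][-1] != clamped:
            index[word].append(clamped)
    return dict(index)


if __name__ == "__main__":
    # A tiny limit makes the clamping visible.
    print(build_positions(["a", "b", "a", "c", "a"], max_position=3))
```

With a limit of 3, the word "c" at position 4 is still indexed but is recorded at position 3, so a lookup finds the document while phrase or proximity matching past the limit becomes approximate.
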
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
