Re: pg_migrator and an 8.3-compatible tsvector data type

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: pg_migrator and an 8.3-compatible tsvector data type
Date: 2009-05-29 18:16:19
Message-ID: 200905291816.n4TIGJM23816@momjian.us
Lists: pgsql-hackers

Tom Lane wrote:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
> > Bruce,
> >> The ordering of the lexemes was changed:
>
> > What does that get us in terms of performance etc.?
>
> It was changed to support partial-match tsvector queries. Without it,
> a partial match query would have to scan entire tsvectors instead
> of applying binary search. I don't know if Oleg and Teodor did any
> actual performance tests on the size of the hit, but it seems like
> it could be pretty awful for large documents.
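
To illustrate the point about binary search, here is a minimal sketch in
plain C, not the PostgreSQL code. It assumes the lexemes are plain C strings
already sorted in byte-wise order; the names prefix_lower_bound and
count_prefix_matches are hypothetical. With that ordering, all lexemes that
share a prefix are contiguous, so one binary search finds the first candidate
and the scan stops at the first non-match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the index of the first entry that is >= prefix in byte-wise
 * order, or n if there is none.  The array must already be sorted with
 * strcmp() semantics (an assumption of this sketch).
 */
static size_t
prefix_lower_bound(const char *const *lexemes, size_t n, const char *prefix)
{
    size_t lo = 0;
    size_t hi = n;

    assert(lexemes != NULL && prefix != NULL);

    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (strcmp(lexemes[mid], prefix) < 0)
            lo = mid + 1;       /* everything up to mid sorts before prefix */
        else
            hi = mid;           /* mid may be the first candidate */
    }
    return lo;
}

/*
 * Count the entries that start with prefix.  Matches are contiguous in a
 * byte-wise sorted array, so the loop ends at the first non-match.
 */
static size_t
count_prefix_matches(const char *const *lexemes, size_t n, const char *prefix)
{
    size_t plen = strlen(prefix);
    size_t i = prefix_lower_bound(lexemes, n, prefix);
    size_t count = 0;

    while (i < n && strncmp(lexemes[i], prefix, plen) == 0)
    {
        count++;
        i++;
    }
    return count;
}
```

If the array were ordered some other way, for example by length first, the
matching entries could be scattered and every entry would have to be checked.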

I started thinking about the performance issues of the tsvector changes.
Teodor gave me this code for conversion that basically does:

qsort_arg((void *) ARRPTR(t), t->size, sizeof(WordEntry), cmpLexeme, (void*) t);

So every cast requires a sort, which would perform poorly for a large
document. Because we do not store the sorted result, the sort is repeated
on every access, which might be an unacceptable performance burden.
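
A rough standalone sketch of that per-cast cost, using the standard qsort()
rather than qsort_arg(). The Lexeme struct, cmp_bytes_first, and
convert_ordering are hypothetical stand-ins, not WordEntry or cmpLexeme, and
the comparator (bytes first, then length) is only my assumption about what a
prefix-friendly ordering looks like:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a lexeme entry; not PostgreSQL's WordEntry. */
typedef struct
{
    const char *text;
    size_t      len;
} Lexeme;

/* Compare bytes first; on a tie over the common length, shorter sorts first. */
static int
cmp_bytes_first(const void *a, const void *b)
{
    const Lexeme *x = a;
    const Lexeme *y = b;
    size_t      minlen = x->len < y->len ? x->len : y->len;
    int         c = memcmp(x->text, y->text, minlen);

    if (c != 0)
        return c;
    return (x->len > y->len) - (x->len < y->len);
}

/*
 * Re-sort the entries in place.  This costs O(n log n) comparisons on
 * every call; if the result is not stored, each access pays it again.
 */
static void
convert_ordering(Lexeme *entries, size_t n)
{
    assert(entries != NULL || n == 0);

    if (n > 1)
        qsort(entries, n, sizeof(Lexeme), cmp_bytes_first);
}
```

Calling convert_ordering() once and storing the result would pay the sort a
single time, which is in effect what rebuilding the column does.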

So, one idea would be, instead of a cast, have pg_migrator rebuild the
tsvector columns with ALTER TABLE, so then the 8.4 index code could be
used. But then we might as well just tell the users to migrate the
tsvector tables themselves, which is how pg_migrator behaves now.

Obviously we are still trying to figure out the best way to handle data
type changes; I think as soon as we figure out a plan for tsvector we
can use that method for future changes.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
