Re: Fix receiving large legal tsvector from binary format

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ерохин Денис Владимирович <erohin-d(at)datagile(dot)ru>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Fix receiving large legal tsvector from binary format
Date: 2023-10-01 17:24:01
Message-ID: 1842217.1696181041@sss.pgh.pa.us
Lists: pgsql-hackers

Ерохин Денис Владимирович <erohin-d(at)datagile(dot)ru> writes:
> There is a problem on receiving large tsvector from binary format with
> getting error "invalid tsvector: maximum total lexeme length exceeded".

Good catch! Even without an actual failure, we'd be wasting space
on-disk anytime we stored a tsvector received through binary input.

I pushed your 0001 and 0002, but I don't really agree that 0003
is an improvement. It looks to me like it'll result in one
repalloc per lexeme, instead of the O(log N) behavior we had before.
It's not that important to keep the palloc chunk size small here,
given that we don't allow tsvectors to get anywhere near 1GB anyway.
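
To illustrate the reallocation-count argument above, here is a minimal standalone sketch, not the actual tsvector receive code. It uses plain realloc() rather than repalloc(), and the Buf struct, buf_append(), count_reallocs(), the 64-byte starting capacity, and the fixed 8-byte lexeme are all hypothetical names and values chosen for illustration. It compares growing a buffer to the exact size needed on each append with doubling the capacity whenever it runs out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer; not a PostgreSQL structure. */
typedef struct
{
    char   *data;
    size_t  used;       /* bytes currently in use */
    size_t  cap;        /* bytes currently allocated */
    int     nrealloc;   /* number of realloc() calls made so far */
} Buf;

/*
 * Append len bytes. When the buffer is too small, either double the
 * capacity (doubling != 0) or grow to exactly the size needed.
 */
static void
buf_append(Buf *b, const char *src, size_t len, int doubling)
{
    size_t  need = b->used + len;

    if (need > b->cap)
    {
        size_t  newcap = need;  /* exact-fit growth by default */

        if (doubling)
        {
            newcap = b->cap ? b->cap : 64;  /* assumed initial size */
            while (newcap < need)
                newcap *= 2;
        }
        b->data = realloc(b->data, newcap);
        assert(b->data != NULL);
        b->cap = newcap;
        b->nrealloc++;
    }
    memcpy(b->data + b->used, src, len);
    b->used += len;
}

/* Append nlexemes 8-byte lexemes and return the number of reallocations. */
static int
count_reallocs(int nlexemes, int doubling)
{
    Buf     b = {NULL, 0, 0, 0};
    int     i;
    int     n;

    for (i = 0; i < nlexemes; i++)
        buf_append(&b, "lexeme01", 8, doubling);
    n = b.nrealloc;
    free(b.data);
    return n;
}
```

Under these assumptions, count_reallocs(1000, 0) performs 1000 reallocations (one per lexeme), while count_reallocs(1000, 1) performs 8, since the capacity goes 64, 128, ..., 8192. The doubling variant may leave some slack in the final allocation, which is the trade-off the paragraph above considers acceptable.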

regards, tom lane
