pgsql: Doc: remove bogus claim that tsvectors can have up to 2^64 entri

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Doc: remove bogus claim that tsvectors can have up to 2^64 entri
Date: 2026-03-31 15:49:59
Message-ID: E1w7bLq-002KaL-1I@gemulon.postgresql.org
Lists: pgsql-committers

Doc: remove bogus claim that tsvectors can have up to 2^64 entries.

This is nonsense on its face, since the textsearch parsing logic
generally uses int32 to count words (see, e.g., struct ParsedText).
Not to mention that we don't support input strings larger than
1GB.

The actual limitation of interest is documented nearby: a tsvector
can't be larger than 1MB, thanks to 20-bit offset fields within it
(see WordEntry.pos). That constrains us to well under 256K lexemes
per tsvector, depending on how many positions are stored per lexeme.
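
As a rough illustration of the arithmetic above (a sketch, not code from the PostgreSQL tree): a 20-bit offset can address 2^20 bytes, which is the 1MB figure. The message does not spell out how the 256K ceiling is derived; the sketch below assumes each lexeme consumes at least 4 bytes of the addressable region, which reproduces that number. The helper name lexeme_upper_bound and the 4-byte cost are hypothetical, chosen only for illustration.

```python
# Back-of-the-envelope bounds implied by a 20-bit offset field.
POS_BITS = 20  # width of the offset field, per the commit message

# Largest region a 20-bit offset can address: 2^20 bytes = 1 MB.
addressable_bytes = 1 << POS_BITS


def lexeme_upper_bound(bytes_per_lexeme: int) -> int:
    """Upper bound on lexeme count if every lexeme costs at least
    bytes_per_lexeme bytes of the addressable region (an assumption)."""
    return addressable_bytes // bytes_per_lexeme


# Assumed minimum cost of 4 bytes per lexeme gives the 256K ceiling.
ceiling = lexeme_upper_bound(4)

# Stored positions add to the per-lexeme cost, so real limits are lower.
with_positions = lexeme_upper_bound(4 + 2 * 8)  # e.g. 8 two-byte positions each

print(addressable_bytes, ceiling, with_positions)
```

Either way, the result is many orders of magnitude below 2^64, and also well below what an int32 word counter could hold.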

It seems sufficient therefore to just remove the bit about number
of lexemes.

Author: Dharin Shah <dharinshah95(at)gmail(dot)com>
Discussion: https://postgr.es/m/CAOj6k6d0YO6AO-bhxkfUXPxUi-+YX9-doh2h5D5z0Bm8D2w=OA@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/960382e3e991f774d0ef92eb82dd7ef641f74108

Modified Files
--------------
doc/src/sgml/textsearch.sgml | 5 -----
1 file changed, 5 deletions(-)
