BUG #17557: ts_headline will error with "invalid memory alloc request size" for large documents

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: magicagent(at)gmail(dot)com
Subject: BUG #17557: ts_headline will error with "invalid memory alloc request size" for large documents
Date: 2022-07-22 15:39:42
Message-ID: 17557-6ddc074c8b1bd6df@postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 17557
Logged by: Alex Malek
Email address: magicagent(at)gmail(dot)com
PostgreSQL version: 14.4
Operating system: Red Hat
Description:

ts_headline, when given a document over a certain size or number of words,
fails with "ERROR: invalid memory alloc request size XXXXXX"

# select ts_headline('b ' || repeat('1 ',16777215), $$'b'$$::tsquery,
'MaxWords=4, MinWords=3') ;
ERROR: invalid memory alloc request size 1610612736

The failure depends not just on document size but also on the number of
"words" in the document:

One fewer "word" works:

select ts_headline('b ' || repeat('1 ',16777214), $$'b'$$::tsquery,
'MaxWords=4, MinWords=3') ;
ts_headline
----------------
<b>b</b> 1 1 1
(1 row)

Memory is not an issue for longer "words", up to a point:

# select ts_headline('b ' || repeat('123456789012345 ',16777214),
$$'b'$$::tsquery, 'MaxWords=4, MinWords=3') ;
ts_headline
----------------------------------------------------------
<b>b</b> 123456789012345 123456789012345 123456789012345
(1 row)

# select ts_headline('b ' || repeat('1234567890123456 ',16777214),
$$'b'$$::tsquery, 'MaxWords=4, MinWords=3') ;
ERROR: invalid memory alloc request size 1140850564

The memory error appears to depend on both the total number of words and the
word length:

# select ts_headline('b ' || repeat('1234567890123456 ',15790000),
$$'b'$$::tsquery, 'MaxWords=4, MinWords=3') ;
ts_headline
-------------------------------------------------------------
<b>b</b> 1234567890123456 1234567890123456 1234567890123456
(1 row)

# select ts_headline('b ' || repeat('1234567890123456 ',15795000),
$$'b'$$::tsquery, 'MaxWords=4, MinWords=3') ;
ERROR: invalid memory alloc request size 1074060012

I get the same results even after increasing PostgreSQL GUCs, including
work_mem, shared_buffers and effective_cache_size, and also on machines with
significantly more RAM, with and without HugePages enabled.
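
A possible reading of why the settings and extra RAM make no difference: all
three request sizes in the errors above are larger than PostgreSQL's
MaxAllocSize, the compile-time cap on a single ordinary palloc request
(0x3fffffff bytes, i.e. 1 GB minus 1). That this cap is what ts_headline is
hitting is an assumption based only on the error text and the numbers, not
something confirmed in this report. A minimal arithmetic sketch in Python,
where the names MAX_ALLOC_SIZE, reported_sizes and exceeds_limit are
illustrative only:

```python
# PostgreSQL's MaxAllocSize (src/include/utils/memutils.h): 0x3fffffff bytes.
MAX_ALLOC_SIZE = 0x3FFFFFFF

# Request sizes copied from the three ERROR lines in this report.
reported_sizes = [1610612736, 1140850564, 1074060012]


def exceeds_limit(size: int) -> bool:
    """Return True if a single allocation of `size` bytes is over the cap."""
    return size > MAX_ALLOC_SIZE


for size in reported_sizes:
    # Show how far each failing request is past the cap.
    print(f"{size} bytes requested, {size - MAX_ALLOC_SIZE} bytes over the cap")
```

If that reading is right, the cap is independent of work_mem, shared_buffers,
effective_cache_size and the amount of physical memory, which would match the
observation that changing them has no effect.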
