Re: BUG #14654: With high statistics targets on ts_vector, unexpectedly high memory use & OOM are triggered

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: james+postgres(at)carbocation(dot)com, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #14654: With high statistics targets on ts_vector, unexpectedly high memory use & OOM are triggered
Date: 2017-07-12 11:32:11
Message-ID: dc65ba89-46d1-07f2-3f94-51ba00446931@iki.fi
Lists: pgsql-bugs

On 05/14/2017 11:06 PM, james+postgres(at)carbocation(dot)com wrote:
> It seems that ANALYZE on a ts_vector column can consume 300 * (statistics
> target) * (size of data in field), which in my case ended up being well
> above 10 gigabytes. I wonder if this might be considered a bug (either in
> code, or of documentation), as this memory usage seems not to obey other
> limits, or at least wasn't documented in a way that might have helped me
> guess at the underlying problem.

Yes, I can see that happening here too. The problem seems to be that the
analyze function detoasts every row in the sample and keeps the detoasted
copies around. Tsvectors can be very large, so it adds up.
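
To put rough numbers on that, here is a back-of-the-envelope sketch in C. The 300 * (statistics target) sample size comes from the report above; the function names and the 4096-byte average tsvector size are made up for illustration, not measured from the reporter's data.

```c
#include <assert.h>
#include <stdint.h>

/* Number of rows ANALYZE asks for in its sample: 300 * statistics target,
 * as quoted in the bug report. */
uint64_t analyze_sample_rows(uint32_t statistics_target)
{
    return 300ULL * statistics_target;
}

/* Rough upper bound on memory held by detoasted copies if none of them
 * is freed until the whole sample has been processed.
 * avg_bytes is a hypothetical average detoasted tsvector size. */
uint64_t estimated_detoasted_bytes(uint32_t statistics_target, uint64_t avg_bytes)
{
    uint64_t rows = analyze_sample_rows(statistics_target);

    /* guard against overflow in the multiplication below */
    assert(avg_bytes == 0 || rows <= UINT64_MAX / avg_bytes);
    return rows * avg_bytes;
}
```

With a statistics target of 10000 and an assumed average of 4096 bytes per detoasted tsvector, estimated_detoasted_bytes(10000, 4096) comes to 12,288,000,000 bytes, roughly 12 GB, provided the table has at least 3,000,000 rows to sample.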

That's pretty easy to fix: the analyze function needs to free the
detoasted copies as it goes. But in order to do that, it needs to make
copies of all the lexemes stored in the hash table, instead of pointing
directly into the detoasted copies.
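
The idea can be sketched in standalone C as follows. This is not the attached patch and uses no PostgreSQL APIs: LexemeTable, table_add, process_row and friends are hypothetical names, the "hash table" is a small linear-search array to keep the example short, and "detoasting" is simulated by copying a space-separated string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LEXEMES 64   /* arbitrary cap for this sketch */

typedef struct {
    char *lexeme;        /* privately owned copy, not a pointer into the row */
    int   count;
} LexemeEntry;

typedef struct {
    LexemeEntry entries[MAX_LEXEMES];
    int         nentries;
} LexemeTable;

/* Count one occurrence of lex[0..len). New lexemes are copied into the
 * table so the caller may free the buffer that lex points into.
 * Returns 0 on success, -1 if the table is full or allocation fails. */
int table_add(LexemeTable *t, const char *lex, size_t len)
{
    assert(t != NULL && lex != NULL);
    for (int i = 0; i < t->nentries; i++) {
        if (strlen(t->entries[i].lexeme) == len &&
            memcmp(t->entries[i].lexeme, lex, len) == 0) {
            t->entries[i].count++;
            return 0;
        }
    }
    if (t->nentries == MAX_LEXEMES)
        return -1;
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, lex, len);
    copy[len] = '\0';
    t->entries[t->nentries].lexeme = copy;
    t->entries[t->nentries].count = 1;
    t->nentries++;
    return 0;
}

/* Process one sampled row: make a temporary "detoasted" copy, count its
 * space-separated lexemes, then free the copy before returning. */
int process_row(LexemeTable *t, const char *stored_row)
{
    size_t n = strlen(stored_row);
    char *detoasted = malloc(n + 1);     /* stands in for detoasting */
    if (detoasted == NULL)
        return -1;
    memcpy(detoasted, stored_row, n + 1);

    int rc = 0;
    const char *p = detoasted;
    while (*p != '\0' && rc == 0) {
        while (*p == ' ')
            p++;
        const char *start = p;
        while (*p != '\0' && *p != ' ')
            p++;
        if (p > start)
            rc = table_add(t, start, (size_t) (p - start));
    }

    free(detoasted);   /* safe: the table holds its own copies */
    return rc;
}

/* Return the recorded count for a lexeme, or 0 if it was never seen. */
int table_count(const LexemeTable *t, const char *lex)
{
    for (int i = 0; i < t->nentries; i++)
        if (strcmp(t->entries[i].lexeme, lex) == 0)
            return t->entries[i].count;
    return 0;
}

/* Release all lexeme copies owned by the table. */
void table_free(LexemeTable *t)
{
    for (int i = 0; i < t->nentries; i++)
        free(t->entries[i].lexeme);
    t->nentries = 0;
}
```

With a zero-initialized LexemeTable, calling process_row on "cat dog cat" and then on "dog bird" leaves three entries with counts 2, 2 and 1. Peak memory is then bounded by one row's detoasted copy plus the distinct lexemes, rather than by the whole sample.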

Patch attached. I think this counts as a bug, and we should backport this.

- Heikki

Attachment: reduce-tsvector-analyze-memory-usage.patch (text/x-diff, 2.1 KB)
