tuplesort Generation memory contexts don't play nicely with index builds

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: tuplesort Generation memory contexts don't play nicely with index builds
Date: 2022-06-29 00:59:52
Message-ID: CAApHDvrHQkiFRHiGiAS-LMOvJN-eK-s762=tVzBz8ZqUea-a_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hackers,

I noticed while doing some memory context related work that since we
now use generation.c memory contexts for tuplesorts (40af10b57) that
tuplesort_putindextuplevalues() causes memory "leaks" in the
generation context due to index_form_tuple() being called while we're
switched into the state->tuplecontext.

I use the word "leak" here slightly loosely. It's only a leak due to
how generation.c uses no free lists to allow reuse pfree'd memory.

It looks like the code has been this way ever since 9f03ca915 (Avoid
copying index tuples when building an index.) That commit did add a
big warning at the top of index_form_tuple() that the function must be
careful to not leak any memory.

A quick fix would be just to add a new bool field, e.g, usegencxt to
tuplesort_begin_common() and pass that as false in all functions apart
from tuplesort_begin_heap(). That way we'll always be using an aset.c
context when we call index_form_tuple().

However, part of me thinks that 9f03ca915 is standing in the way of us
doing more in the future to optimize how we store tuples during sorts.
We might, one day, want to consider using a hand-rolled bump
allocator. If we ever do that we'd need to undo the work done by
9f03ca915.

Does anyone have any thoughts on this?

Here's a reproducer from the regression tests:

CREATE TABLE no_index_cleanup (i INT PRIMARY KEY, t TEXT);
-- Use uncompressed data stored in toast.
CREATE INDEX no_index_cleanup_idx ON no_index_cleanup(t);
ALTER TABLE no_index_cleanup ALTER COLUMN t SET STORAGE EXTERNAL;
INSERT INTO no_index_cleanup(i, t) VALUES (generate_series(1,30),
repeat('1234567890',269));
-- index cleanup option is ignored if VACUUM FULL
VACUUM (INDEX_CLEANUP TRUE, FULL TRUE) no_index_cleanup;

David

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-06-29 01:17:15 Re: Export log_line_prefix(); useful for emit_log_hook.
Previous Message Peter Geoghegan 2022-06-29 00:32:26 Re: First draft of the PG 15 release notes