Re: PATCH: decreasing memory needlessly consumed by array_agg

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To:
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: decreasing memory needlessly consumed by array_agg
Date: 2015-01-14 21:52:58
Message-ID: 54B6E53A.5040808@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 12.1.2015 01:28, Ali Akbar wrote:
>
> > Or else we implement what you suggest below (more comments below):
> >
> > Thinking about the 'release' flag a bit more - maybe we
> could do
> > this
> > instead:
> >
> > if (release && astate->private_cxt)
> > MemoryContextDelete(astate->mcontext);
> > else if (release)
> > {
> > pfree(astate->dvalues);
> > pfree(astate->dnulls);
> > pfree(astate);
> > }
> >
> > i.e. either destroy the whole context if possible, and
> just free the
> > memory when using a shared memory context. But I'm afraid
> this would
> > penalize the shared memory context, because that's
> intended for
> > cases where all the build states coexist in parallel and
> then at some
> > point are all converted into a result and thrown away.
> Adding pfree()
> > calls is no improvement here, and just wastes cycles.
> >
> >
> > As per Tom's comment, i'm using "parent memory context" instead of
> > "shared memory context" below.
> >
> > In the future, if some code writer decided to use
> subcontext=false,
> > to save memory in cases where there are many array
> accumulation, and
> > the parent memory context is long-living, current code can cause
> > memory leak. So i think we should implement your suggestion
> > (pfreeing astate), and warn the implication in the API
> comment. The
> > API user must choose between release=true, wasting cycles but
> > preventing memory leak, or managing memory from the parent memory
> > context.
>
> I'm wondering whether this is necessary after fixing makeArrayResult to
> use the privat_cxt flag. It's still possible to call makeMdArrayResult
> directly (with the wrong 'release' value).
>
> Another option might be to get rid of the 'release' flag altogether, and
> just use the 'private_cxt' - I'm not aware of a code using release=false
> with private_cxt=true (e.g. to build the same array twice from the same
> astate). But maybe there's such code, and another downside is that it'd
> break the existing API.
>
> > In one possible use case, for efficiency maybe the caller will
> > create a special parent memory context for all array accumulation.
> > Then uses makeArrayResult* with release=false, and in the end
> > releasing the parent memory context once for all.
>
> Yeah, although I'd much rather not break the existing code at all. That
> is - my goal is not to make it slower unless absolutely necessary (and
> in that case the code may be fixed per your suggestion). But I'm not
> convinced it's worth it.
>
>
> OK. Do you think we need to note this in the comments? Something like
> this: If using subcontext=false, the caller must be careful about memory
> usage, because makeArrayResult* will not free the memory used.

Yes, I think it's worth mentioning.

> But I think it makes sense to move the error handling into
> initArrayResultArr(), including the get_element_type() call, and remove
> the element_type from the signature. This means initArrayResultAny()
> will call the get_element_type() twice, but I guess that's negligible.
> And everyone who calls initArrayResultArr() will get the error handling
> for free.
>
> Patch v7 attached, implementing those two changes, i.e.
>
> * makeMdArrayResult(..., astate->private_cxt)
> * move error handling into initArrayResultArr()
> * remove element_type from initArrayResultArr() signature
>
>
> Reviewing the v7 patch:
> - applies cleanly to current master. patch format, whitespace, etc is good
> - make check runs without error
> - performance & memory usage still consistent
>
> If you think we don't have to add the comment (see above), i'll mark
> this as ready for committer

Attached is v8 patch, with a few comments added:

1) before initArrayResult() - explaining when it's better to use a
single memory context, and when it's more efficient to use a
separate memory context for each array build state

2) before makeArrayResult() - explaining that it won't free memory
when allocated in a single memory context (and that a pfree()
has to be used if necessary)

3) before makeMdArrayResult() - explaining that it's illegal to use
release=true unless using a subcontext

Otherwise the v8 patch is exactly the same as v7. Assuming the comments
make it sufficiently clear, I agree with marking this as 'ready for
committer'.

regards

--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
array-agg-v8.patch text/x-diff 14.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2015-01-14 21:57:49 Re: pg_rewind in contrib
Previous Message Peter Eisentraut 2015-01-14 21:48:53 Re: orangutan seizes up during isolation-check