Re: Performance of aggregates over set-returning functions

From: "John Smith" <sodgodofall(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance of aggregates over set-returning functions
Date: 2008-06-09 22:57:52
Message-ID: b88f0d670806091557m7be097fdyd2bafdf4b2706351@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Sorry for the long delay in following up on this suggestion. The
change Tom suggested fixed the performance problems I was seeing, but
I never ran the full regression suite on the modified code, as
everything in my performance tests seemed to indicate the bug was
fixed (i.e, no errors even with --cassert-enabled). When I recently
ran the regression suite on the "fixed" version of Postgres, the
"misc" test suite fails with the following error message: "ERROR:
cache lookup failed for type 2139062143". Is this a manifestation of
the problem where certain items are being freed too early? Any other
ideas as to what's going on here?

Thanks,
John

On Tue, Jan 8, 2008 at 8:51 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "John Smith" <sodgodofall(at)gmail(dot)com> writes:
>>> It's pipelined either way. But int8 is a pass-by-reference data type,
>>> and it sounds like we have a memory leak for this case.
>
>> Thanks for your reply. How easy is it to fix this? Which portion of
>> the code should we look to change?
>
> I was just looking at that. The issue looks to me that nodeResult.c
> (and other plan node types that support SRFs in their targetlists)
> do this:
>
> /*
> * Check to see if we're still projecting out tuples from a previous scan
> * tuple (because there is a function-returning-set in the projection
> * expressions). If so, try to project another one.
> */
> if (node->ps.ps_TupFromTlist)
> {
> resultSlot = ExecProject(node->ps.ps_ProjInfo, &isDone);
> if (isDone == ExprMultipleResult)
> return resultSlot;
> /* Done with that source tuple... */
> node->ps.ps_TupFromTlist = false;
> }
>
> /*
> * Reset per-tuple memory context to free any expression evaluation
> * storage allocated in the previous tuple cycle. Note this can't happen
> * until we're done projecting out tuples from a scan tuple.
> */
> ResetExprContext(econtext);
>
> whereas there would be no memory leak if these two chunks of code were
> in the other order. The question is whether resetting the context first
> would throw away any data that we *do* still need for the repeated
> ExecProject calls. That second comment block implies there's something
> we do need.
>
> I'm not sure why it's like this. Some digging in the CVS history shows
> that indeed the code used to be in the other order, and I switched it
> (and added the second comment block) in this old commit:
>
> http://archives.postgresql.org/pgsql-committers/2000-08/msg00218.php
>
> I suppose that the SQL-function support at the time required that its
> calling memory context be persistent until it returned ExprEndResult,
> but I sure don't recall any details. It's entirely possible that that
> requirement no longer exists, or could easily be eliminated given all
> the other changes that have happened since then. nodeFunctionscan.c
> seems to reset the current context for each call of a SRF, so I'd think
> that anything that can't cope with that should have been flushed out
> by now.
>
> If you feel like poking at this, I *strongly* recommend doing your
> testing in an --enable-cassert build. You'll have no idea whether you
> freed stuff too early if you don't have CLOBBER_FREED_MEMORY enabled.
>
> regards, tom lane
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2008-06-10 00:17:24 Re: Performance of aggregates over set-returning functions
Previous Message Scott Marlowe 2008-06-08 06:09:19 Re: Optimizing AGE()