Re: Memory Accounting v11

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Memory Accounting v11
Date: 2015-07-15 01:14:26
Message-ID: 55A5B3F2.7070404@2ndquadrant.com
Lists: pgsql-hackers

Hi,

On 07/14/2015 10:19 PM, Robert Haas wrote:
> On Sat, Jul 11, 2015 at 2:28 AM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>> After talking with a few people at PGCon, small noisy differences
>> in CPU timings can appear for almost any tweak to the code, and
>> aren't necessarily cause for major concern.
>
> I agree with that in general, but the concern is a lot bigger when the
> function is something that is called everywhere and accounts for a
> measurable percentage of our total CPU usage on almost any workload.
> If memory allocation got slower because, say, you added some code to
> regexp.c and it caused AllocSetAlloc to split a cache line where it
> hadn't previously, I wouldn't be worried about that; the next patch,
> like as not, will buy the cost back again. But here you really are
> adding code to a hot path.

I don't think Jeff was suggesting that we should ignore changes
measurably affecting performance - I'm one of the people he discussed
this patch with at pgCon, and I can assure you the impact on performance
was one of the main topics of that discussion.

Firstly, do we really have good benchmarks and measurements? I really
doubt that. We do have some numbers for REINDEX, where we observed a
0.5-1% regression in noisy results from a Power machine (and we've been
unable to reproduce that on x86). I don't think that's a representative
benchmark, and I'd like to see more thorough measurements - I agreed to
do that once Jeff comes up with a new version of the patch.

Secondly, the question is whether the performance is impacted more by
the additional instructions, or by other things - say, random padding,
as was explained by Andrew Gierth in [1].

I don't know whether that's happening in this patch, but if it is, it
seems rather strange to hold it against this patch and not the others
(there will surely be other patches causing similar issues).

[1]
http://www.postgresql.org/message-id/87vbitb2zp.fsf@news-spur.riddles.org.uk

> tuplesort.c does its own accounting, and TBH that seems like the right
> thing to do here, too. The difficulty is, I think, that some
> transition functions use an internal data type for the transition
> state, which might not be a single palloc'd chunk. But since we can't
> spill those aggregates to disk *anyway*, that doesn't really matter.
> If the transition is a varlena or a fixed-length type, we can know how
> much space it's consuming without hooking into the memory context
> framework.

I respectfully disagree. Our current inability to dump/load the state
has little to do with how we measure consumed memory, IMHO.

It's true that we do have two kinds of aggregates, depending on the
nature of the aggregate state:

(a) fixed-size state (data types passed by value, variable length types
that do not grow once allocated, ...)

(b) continuously growing state (as in string_agg/array_agg)
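
Just to illustrate the difference, the two shapes of state might look
something like this in C (illustration only, not actual PostgreSQL
structs):

    #include <stdint.h>
    #include <stddef.h>

    /* (a) fixed-size state, e.g. an avg(bigint)-style accumulator - the
     * footprint is known up front and never changes */
    typedef struct FixedState
    {
        int64_t     count;          /* rows aggregated so far */
        int64_t     sum;            /* running sum */
    } FixedState;

    /* (b) continuously growing state, e.g. string_agg/array_agg - the
     * buffer is reallocated as input arrives, so the footprint depends
     * on the data and keeps changing during the aggregation */
    typedef struct GrowingState
    {
        char       *data;           /* accumulated values */
        size_t      used;           /* bytes currently used */
        size_t      allocated;      /* bytes currently allocated */
    } GrowingState;

For (a) the memory consumption is trivially known up front; for (b) it
is not, unless something keeps track of the allocations.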

Jeff's HashAgg patch already fixes (a) and can fix (b) once we have a
solution for dump/load of the aggregate state - which we need to
implement anyway for parallel aggregate.

I know there was a proposal to force all aggregates to use regular data
types for their aggregate state, but I can't see how that could work
without a significant performance penalty. For example, array_agg() uses
'internal' to pass an ArrayBuildState - how do you turn that into a
regular data type without effectively serializing/deserializing the
whole array on every transition?
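
For reference, ArrayBuildState looks roughly like this (simplified from
memory, not the exact definition in utils/array.h):

    typedef struct ArrayBuildState
    {
        MemoryContext mcontext;     /* where the accumulated data lives */
        Datum      *dvalues;        /* accumulated element values */
        bool       *dnulls;         /* accumulated null flags */
        int         alen;           /* allocated length of the arrays */
        int         nelems;         /* number of elements actually stored */
        Oid         element_type;   /* element type metadata, cached once */
        int16       typlen;
        bool        typbyval;
        char        typalign;
    } ArrayBuildState;

Flattening that into a regular (varlena) array datum after each
transition would mean doing what makeArrayResult() does once per input
row - copying every element accumulated so far - turning an O(n)
aggregate into an O(n^2) one.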

And even if we come up with a solution for array_agg, do we really
believe it's possible for all custom aggregates? Maybe I'm missing
something, but I doubt that. ISTM that designing an ephemeral data
structure allows tweaks that are impossible otherwise.

What might be possible is extending the aggregate API with another
support function returning the size of the aggregate state. So when
defining an aggregate that uses 'internal' for its state, you'd specify
a transfunc, a finalfunc and a sizefunc. That seems doable, I guess.
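
To make that a bit more concrete, a sizefunc for array_agg might look
something like this (just a sketch of the proposed API - nothing like
this exists today, and the signature is made up):

    #include "postgres.h"
    #include "fmgr.h"
    #include "utils/array.h"

    PG_FUNCTION_INFO_V1(array_agg_sizefunc);

    /* hypothetical: report how much memory the transition state uses */
    Datum
    array_agg_sizefunc(PG_FUNCTION_ARGS)
    {
        ArrayBuildState *astate = (ArrayBuildState *) PG_GETARG_POINTER(0);
        int64       size;

        size = sizeof(ArrayBuildState)
             + astate->alen * (sizeof(Datum) + sizeof(bool));

        /*
         * For pass-by-reference element types we'd also have to walk
         * dvalues and add up the size of each element - one reason a
         * hand-written sizefunc is easy to get subtly wrong.
         */
        PG_RETURN_INT64(size);
    }

Every aggregate with an 'internal' state would have to provide (and
maintain) such a function, of course.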

I find the memory accounting a much more elegant solution, though.
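
For anyone not familiar with the approach, the general idea is simply
this (a standalone toy illustration, not the actual patch): every
allocation made through a context bumps a per-context counter, so asking
"how much memory is this using?" becomes a cheap read instead of a walk
over all the allocations.

    #include <stdio.h>
    #include <stdlib.h>

    typedef struct Context
    {
        size_t      mem_allocated;  /* running total of bytes handed out */
    } Context;

    static void *
    ctx_alloc(Context *ctx, size_t size)
    {
        void       *ptr = malloc(size);

        if (ptr != NULL)
            ctx->mem_allocated += size; /* the only extra per-alloc work */
        return ptr;
    }

    int
    main(void)
    {
        Context     cxt = {0};
        int         i;

        /* pretend these are transition states in a hash table
         * (intentionally never freed, the process exits right away) */
        for (i = 0; i < 1000; i++)
            ctx_alloc(&cxt, 64);

        /* a consumer like HashAgg would compare this against work_mem */
        printf("allocated: %zu bytes\n", cxt.mem_allocated);
        return 0;
    }

The patch does the equivalent inside the memory context machinery, which
is of course where the worries about hot-path overhead come from.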

kind regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
