Re: Make MemoryContextMemAllocated() more precise

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Make MemoryContextMemAllocated() more precise
Date: 2020-04-06 11:39:03
Message-ID: CAApHDvq4ATgVtkniK-pL0ZpdeVvDKvPp66ymHYtT=SVm8aON_w@mail.gmail.com
Lists: pgsql-hackers

On Sat, 28 Mar 2020 at 13:21, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> Attached refactoring patch. There's enough in here that warrants
> discussion that I don't think this makes sense for v13 and I'm adding
> it to the July commitfest.

I had a read over this too. I noted down the following during my pass of it.

1. The comment mentions "passthru", but you've removed that parameter.

 * For now, the passthru pointer just points to "int level"; later we might
 * make that more complicated.
 */
static void
-MemoryContextStatsPrint(MemoryContext context, void *passthru,
+MemoryContextStatsPrint(MemoryContext context, int level,
                         const char *stats_string)
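
Presumably the comment just needs updating to match. Maybe something
along these lines (wording is only a suggestion):

 * For now, "level" is just the recursion depth used to indent the
 * output; later we might make that more complicated.
 */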

2. I don't think MemoryContextCount is the best name for this
function. When I saw:

counters = MemoryContextCount(aggstate->hash_metacxt, flags, true);

I assumed it was returning some integer count, at least until I looked
at the "counters" datatype.

Maybe it would be better to name the function
MemoryContextGetTelemetry() and the struct MemoryContextTelemetry
rather than MemoryContextCounters? Or maybe call the struct
MemoryContextTally and the function either MemoryContextGetTelemetry()
or MemoryContextGetTally(). Or perhaps MemoryContextGetAccounting()
and the struct MemoryContextAccounting.

3. I feel like it would be nicer if you didn't change the "count"
methods to return a MemoryContextCounters by value. As written, each
level must zero a struct, assign the values, then add them to the
total. If you instead zeroed the struct once in MemoryContextCount()
and passed it down into the count method, the method could just do all
the += work directly. It would reduce the code in MemoryContextCount()
too. A rough sketch of what I mean is below.
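
Here's a simplified sketch of the aset.c method in that style
(freelist accounting elided, so treat it as the shape rather than a
drop-in):

static void
AllocSetStats(MemoryContext context, MemoryContextCounters *totals)
{
    AllocSet   set = (AllocSet) context;
    AllocBlock block;

    for (block = set->blocks; block != NULL; block = block->next)
    {
        /* accumulate straight into the caller's running totals */
        totals->nblocks += 1;
        totals->totalspace += block->endptr - ((char *) block);
        totals->freespace += block->endptr - block->freeptr;
    }
}

MemoryContextCount() would then just memset() the struct once and call
the method for each context it visits.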

4. Do you think it would be better to have two separate functions for
MemoryContextCount(): a recursive version and a non-recursive version?
I feel that the function should be simple enough that it does not make
a great deal of sense to load it up to handle both cases. Looking
around mcxt.c, I see MemoryContextResetOnly() and
MemoryContextResetChildren(); while that's not a perfect example, it
does seem like a better lead to follow. See the sketch below.
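
Something roughly like this, assuming the accumulator-style method
from #3 (names invented here to follow the Reset* pattern, not taken
from the patch):

void
MemoryContextCountOnly(MemoryContext context, MemoryContextCounters *totals)
{
    /* tally this context alone */
    context->methods->stats(context, totals);
}

void
MemoryContextCountChildren(MemoryContext context, MemoryContextCounters *totals)
{
    MemoryContext child;

    /* recursively tally everything below this context */
    for (child = context->firstchild; child != NULL; child = child->nextchild)
    {
        MemoryContextCountOnly(child, totals);
        MemoryContextCountChildren(child, totals);
    }
}

A caller wanting the whole tree just zeroes the struct once, then
calls both functions on the top context.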

5. For performance testing, I tried using the following table, first
with 1MB of work_mem and then with 1GB. I wondered if making the
accounting more complex would cause a slowdown in nodeAgg.c, as I see
we're calling this function each time we get a tuple that belongs in a
new group. The 1MB test is likely not a great way to measure this,
since we spill to disk in that case and the change in accounting means
we do that at a slightly different time, but the 1GB test does not
spill and it's still a bit slower.

create table t (a int);
insert into t select x from generate_series(1,10000000) x;
analyze t;

-- Unpatched

set work_mem = '1MB';
explain analyze select a from t group by a; -- Ran 3 times.

                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=481750.10..659875.95 rows=10000048 width=4) (actual time=3350.190..8193.400 rows=10000000 loops=1)
   Group Key: a
   Planned Partitions: 32
   Peak Memory Usage: 1177 kB
   Disk Usage: 234920 kB
   HashAgg Batches: 1188
   ->  Seq Scan on t  (cost=0.00..144248.48 rows=10000048 width=4) (actual time=0.013..1013.755 rows=10000000 loops=1)
 Planning Time: 0.131 ms
 Execution Time: 8586.420 ms
 Execution Time: 8446.961 ms
 Execution Time: 8449.492 ms
(9 rows)

-- Patched

set work_mem = '1MB';
explain analyze select a from t group by a;

                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=481748.00..659873.00 rows=10000000 width=4) (actual time=3470.107..8598.836 rows=10000000 loops=1)
   Group Key: a
   Planned Partitions: 32
   Peak Memory Usage: 1033 kB
   Disk Usage: 234816 kB
   HashAgg Batches: 1056
   ->  Seq Scan on t  (cost=0.00..144248.00 rows=10000000 width=4) (actual time=0.017..1091.820 rows=10000000 loops=1)
 Planning Time: 0.285 ms
 Execution Time: 8996.824 ms
 Execution Time: 8781.624 ms
 Execution Time: 8900.324 ms
(9 rows)

-- Unpatched

set work_mem = '1GB';
explain analyze select a from t group by a;

                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=169248.00..269248.00 rows=10000000 width=4) (actual time=4537.779..7033.318 rows=10000000 loops=1)
   Group Key: a
   Peak Memory Usage: 868369 kB
   ->  Seq Scan on t  (cost=0.00..144248.00 rows=10000000 width=4) (actual time=0.018..820.136 rows=10000000 loops=1)
 Planning Time: 0.054 ms
 Execution Time: 7561.063 ms
 Execution Time: 7573.985 ms
 Execution Time: 7572.058 ms
(6 rows)

-- Patched

set work_mem = '1GB';
explain analyze select a from t group by a;

                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=169248.00..269248.00 rows=10000000 width=4) (actual time=4840.045..7359.970 rows=10000000 loops=1)
   Group Key: a
   Peak Memory Usage: 861975 kB
   ->  Seq Scan on t  (cost=0.00..144248.00 rows=10000000 width=4) (actual time=0.018..789.975 rows=10000000 loops=1)
 Planning Time: 0.055 ms
 Execution Time: 7904.069 ms
 Execution Time: 7913.692 ms
 Execution Time: 7927.061 ms
(6 rows)

Perhaps the slowdown is unrelated. If it is related, then maybe the
reduction in branching mentioned by Andres would help a bit, plus
perhaps a bit more from what I mentioned above about passing in the
counter struct instead of returning it at each level.

David
