Re: AllocSetEstimateChunkSpace()

From: Andres Freund <andres(at)anarazel(dot)de>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: AllocSetEstimateChunkSpace()
Date: 2020-03-25 22:09:36
Message-ID: 20200325220936.il3ni2fj2j2b45y5@alap3.anarazel.de

Hi,

On 2020-03-25 14:43:43 -0700, Jeff Davis wrote:
> On Wed, 2020-03-25 at 12:42 -0700, Andres Freund wrote:
> > As mentioned on im/call: I think this is mainly an argument for
> > combining the group tuple & per-group state allocations. I'm kind of
> > fine with afterwards taking the allocator overhead into account.
>
> The overhead comes from two places: (a) the chunk header; and (b) the
> round-up-to-nearest-power-of-two behavior.
>
> Combining the group tuple and per-group states only saves the overhead
> from (a); it does nothing to help (b), which is often bigger.

Hm? It very well can help with (b), since the round-up only happens once
now. I guess you could argue that afterwards we'd be more likely to end
up in a bigger size class, and thus have roughly the same amount of
waste due to rounding? But I don't think that's all that convincing.
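
To make the arithmetic concrete, here's a rough sketch of how (a) and (b)
add up for two separate allocations versus one combined one. The 16-byte
header and the specific sizes are made up for illustration (the real
constants live in src/backend/utils/mmgr/aset.c), and pow2_round() /
chunk_footprint() are ad-hoc helpers, not existing functions:

#include <stdio.h>

/*
 * Hypothetical values, for illustration only; the real chunk header size
 * and size classes are defined in aset.c.
 */
#define CHUNK_HDR_SZ 16

/* Round a request up to the next power of two, starting at 8 bytes. */
static size_t
pow2_round(size_t sz)
{
    size_t result = 8;

    while (result < sz)
        result <<= 1;
    return result;
}

/*
 * Space consumed by one small palloc() of 'sz' bytes: the chunk header (a)
 * plus the power-of-two round-up (b).
 */
static size_t
chunk_footprint(size_t sz)
{
    return CHUNK_HDR_SZ + pow2_round(sz);
}

int
main(void)
{
    size_t tuple = 40;      /* group tuple */
    size_t pergroup = 24;   /* per-group transition state */

    /* two chunks: pay (a) twice, round (b) twice -> 80 + 48 = 128 bytes */
    printf("separate: %zu\n",
           chunk_footprint(tuple) + chunk_footprint(pergroup));

    /* one combined chunk: one header, one round-up -> 16 + 64 = 80 bytes */
    printf("combined: %zu\n",
           chunk_footprint(tuple + pergroup));

    return 0;
}

With these particular sizes the combined request also rounds to a smaller
total; as noted above, other size combinations can instead tip the
combined request into the next size class.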

I still, as I mentioned on the call, suspect that the right thing here
is to use an allocation strategy for the tuple and per-group state that
suffers from neither (a) nor (b), and that has faster allocations too.
That would then also mean we no longer need to care about per-allocation
overhead at all (be it (a) or (b)).
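
A minimal sketch of what such a strategy could look like, e.g. a
bump-style allocator that hands out tuple + per-group memory from one big
block, with no per-chunk header and no size-class rounding. All names
here are hypothetical; this is not an existing PostgreSQL API:

#include <stdlib.h>
#include <stddef.h>

/*
 * Hypothetical bump-style allocator: one big block, a cursor, no per-chunk
 * header, no size-class rounding.
 */
typedef struct BumpBlock
{
    char   *cur;        /* next free byte */
    char   *end;        /* first byte past the block */
    char    data[];     /* the block itself */
} BumpBlock;

static BumpBlock *
bump_create(size_t blocksize)
{
    BumpBlock *blk = malloc(offsetof(BumpBlock, data) + blocksize);

    if (blk == NULL)
        return NULL;
    blk->cur = blk->data;
    blk->end = blk->data + blocksize;
    return blk;
}

static void *
bump_alloc(BumpBlock *blk, size_t sz)
{
    void   *result;

    /* keep 8-byte alignment, but no header and no power-of-two rounding */
    sz = (sz + 7) & ~(size_t) 7;
    if (blk->cur + sz > blk->end)
        return NULL;    /* a real version would chain another block here */
    result = blk->cur;
    blk->cur += sz;
    return result;
}

The obvious tradeoff is that individual chunks can't be freed one by one;
everything goes away with the block, which is roughly the lifetime we
need for the tuple and per-group state anyway.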

> And it only saves that overhead when there *is* a per-group state
> (i.e. not for a DISTINCT query).

So?

> > I still don't buy that it's useful to estimate the by-ref transition
> > value overhead. We just don't have anything even close to a
> > meaningful value to base this on.
>
> By-ref transition values aren't a primary motivation for me. I'm fine
> leaving that discussion separate if that's a sticking point. But if we
> do have a way to measure chunk overhead, I don't really see a reason
> not to use it for by-ref as well.

Well, my point is that it's pretty much pointless for by-ref types. The
size estimates, if they exist at all, are so inaccurate that we don't
gain anything by including them. As I said before, I think we'd be
better off initially assuming a higher transition space estimate.

> > Yea, the "context needs to exist" part sucks. I really don't want to
> > add calls directly into AllocSet from more places though. And just
> > ignoring the parameters to the context seems wrong too.
>
> So do you generally favor the EstimateMemoryChunkSpace() patch (that
> works for all context types)? Or do you have another suggestion? Or do
> you think we should just do nothing?

I think I'm increasingly leaning towards either using a constant
overhead factor, or just getting rid of all memory context
overhead. There's clearly no obviously correct design for the "chunk
size" functions, and not having overhead is better than ~correctly
estimating it.
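
For the constant-overhead-factor option, the accounting could look
something like the sketch below. CHUNK_OVERHEAD_ESTIMATE and
estimate_pergroup_space() are made-up names, just to show the shape of
it:

#include <stddef.h>

/*
 * Hypothetical constant per-allocation overhead, instead of asking the
 * memory context for a chunk-size estimate.  The value is illustrative.
 */
#define CHUNK_OVERHEAD_ESTIMATE 16

/*
 * Estimated space per group: the group tuple plus the per-group transition
 * state, each charged a flat per-allocation overhead.
 */
static size_t
estimate_pergroup_space(size_t tupleWidth, size_t pergroupSize)
{
    return (tupleWidth + CHUNK_OVERHEAD_ESTIMATE) +
           (pergroupSize + CHUNK_OVERHEAD_ESTIMATE);
}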

Greetings,

Andres Freund
