Re: tweaking NTUP_PER_BUCKET

From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: "Heikki Linnakangas" <hlinnakangas(at)vmware(dot)com>
Cc: "Tomas Vondra" <tv(at)fuzzy(dot)cz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: tweaking NTUP_PER_BUCKET
Date: 2014-08-20 13:20:03
Message-ID: 5d2f6f19e188b27fe460a11762270ae7.squirrel@sq.gransy.com
Lists: pgsql-hackers

On 20 August 2014, 14:05, Heikki Linnakangas wrote:
> On 07/20/2014 07:17 PM, Tomas Vondra wrote:
>> On 19.7.2014 20:24, Tomas Vondra wrote:
>>> On 13.7.2014 21:32, Tomas Vondra wrote:
>>>> The current patch only implements this for tuples in the main
>>>> hash table, not for skew buckets. I plan to do that, but it will
>>>> require separate chunks for each skew bucket (so we can remove it
>>>> without messing with all of them). The chunks for skew buckets
>>>> should be smaller (I'm thinking about ~4kB), because there'll be
>>>> more of them, but OTOH those buckets are for frequent values so the
>>>> chunks should not remain empty.
>>>
>>> I've looked into extending the dense allocation to the skew buckets,
>>> and I think we shouldn't do that. I got about 50% of the changes and
>>> then just threw it out because it turned out quite pointless.
>>>
>>> The amount of memory for skew buckets is limited to 2% of work_mem,
>>> so even with 100% overhead it'll use ~4% of work_mem. So there's
>>> pretty much nothing to gain here, and the additional complexity
>>> introduced by the dense allocation seems pretty pointless.
>>>
>>> I'm not entirely happy with the code allocating some memory densely
>>> and some using traditional palloc, but it certainly seems cleaner
>>> than the code I had.
>>>
>>> So I think the patch is mostly OK as is.
>>
>> Attached is v4 of the patch, mostly with minor improvements and cleanups
>> (removed MemoryContextStats, code related to skew buckets).
>
> Can you remind me what kind of queries this is supposed to help with?
> Could you do some performance testing to demonstrate the effect? And
> also a worst-case scenario.

The dense allocation? That was not really directed at any specific
queries, it was about lowering hashjoin memory requirements in general.

First, to make it more correct with respect to work_mem (hashjoin does not
account for the palloc overhead, so the actual memory consumption may be
much higher than work_mem).

Second, to compensate for the memory needed for more buckets due to
NTUP_PER_BUCKET, which is tweaked in the 'hashjoin dynamic nbuckets'
patch.

There are some numbers / more detailed analysis in
http://www.postgresql.org/message-id/a17d6217fe0c9e459cb45cb764ad727c.squirrel@sq.gransy.com

>
> At a quick glance, I think you're missing a fairly obvious trick in
> ExecHashIncreaseNumBatches: if a chunk contains only a single tuple, you
> can avoid copying it to a new chunk and just link the old chunk to the
> new list instead. Not sure if ExecHashIncreaseNumBatches is called often
> enough for that to be noticeable, but at least it should help in extreme
> cases where you push around huge > 100MB tuples.

Yeah, I thought about that too, but it seemed like a rare corner case.
Maybe you're right, though - it will also eliminate the 'fluctuation'
(allocating a 100MB chunk, which might push us over work_mem, ...). Not
sure how much this will help, but it's easy enough to fix, I guess.

The other optimization I'm thinking about is that when increasing the
number of batches, the expectation is that only about half the chunks will
be necessary. So instead of freeing a chunk, we might keep it and reuse it
later. That might lower the palloc overhead a bit (but it's already very
low, so I don't think the difference would be measurable).

regards
Tomas
