PATCH: postpone building buckets to the end of Hash (in HashJoin)

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: PATCH: postpone building buckets to the end of Hash (in HashJoin)
Date: 2015-12-14 20:04:24
Message-ID: 566F20C8.8040808@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

attached is v1 of one of the hashjoin improvements mentioned in
September in the lengthy thread [1].

The main objection against simply removing the MaxAllocSize check (and
switching to MemoryContextAllocHuge) is that if the number of rows is
overestimated, we may consume significantly more memory than necessary.

We run into this issue because we allocate the buckets at the very
beginning, based on the estimate. I've noticed we don't really need to
do that - we don't really use the buckets until after the Hash node
completes, and we don't even use it when incrementing the number of
batches (we use the dense allocation for that).

So this patch removes this - it postpones allocating the buckets to the
end of MultiExecHash(), and at that point we know pretty well what is
the optimal number of buckets.

This makes tracking nbuckets_optimal/log2_nbuckets_optimal unnecessary,
as we can simply use nbuckets/log2_nbuckets for that purpose. I've also
removed nbuckets_original, but maybe that's not a good idea and we want
to keep that information (OTOH we never use that number of buckets).

This patch does not change the estimation in ExecChooseHashTableSize()
at all, because we still need to do that to get nbucket/nbatch. Maybe
this is another opportunity for improvement in case of overestimates,
because in that case it may happen that we do batching even when we
could do without it. So we might start with nbuckets=1024 and
nbatches=1, and only switch to the estimated number of batches if really
needed.

This patch also does not mess with the allocation, i.e. it still uses
the MaxAllocSize limit (which amounts to ~256MB due to the doubling,
IIRC), but it should make it easier to do that change.

[1] http://www.postgresql.org/message-id/flat/19746(dot)1443983463(at)sss(dot)pgh(dot)pa(dot)us

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
hash-delay-buckets-v1.patch text/x-diff 9.4 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Gurjeet Singh 2015-12-14 20:27:47 Re: Function and view to retrieve WAL receiver status
Previous Message Alvaro Herrera 2015-12-14 20:00:19 Re: fix for readline terminal size problems when window is resized with open pager