Replace hashtable growEnable flag

From: Hubert Zhang <hzhang(at)pivotal(dot)io>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Replace hashtable growEnable flag
Date: 2019-05-15 10:19:38
Message-ID: CAB0yrekv=6_T_eUe2kOEvWUMwufcvfd15SFmCABtYFOkxCFdfA@mail.gmail.com
Lists: pgsql-hackers

Hi all,

When we build a hash table for a hash join node etc., we split the tuples
into different hash buckets. Since the tuples may not all fit in memory,
Postgres splits each bucket into batches: only the current batch of a bucket
is kept in memory, while the other batches are written to disk.

During ExecHashTableInsert(), if the memory used exceeds the limit allowed
for the operator (hashtable->spaceAllowed), the batches are split on the fly
by calling ExecHashIncreaseNumBatches().
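
The check described above can be sketched roughly as follows. This is a toy
model, not the code in nodeHash.c: ToyHashTable and toy_insert() are
hypothetical names, the real check also accounts for other overhead, and
doubling nbatch here only stands in for ExecHashIncreaseNumBatches().

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the relevant fields of the hash join table. */
typedef struct ToyHashTable
{
    size_t spaceUsed;     /* bytes held by the in-memory batch */
    size_t spaceAllowed;  /* memory limit for the operator */
    int    nbatch;        /* current number of batches */
} ToyHashTable;

/*
 * Account for one tuple of tupleSize bytes.
 * Returns 1 if the insert triggered a batch split, 0 otherwise.
 */
int
toy_insert(ToyHashTable *ht, size_t tupleSize)
{
    assert(ht->nbatch >= 1);

    ht->spaceUsed += tupleSize;
    if (ht->spaceUsed > ht->spaceAllowed)
    {
        /* Stands in for ExecHashIncreaseNumBatches(). */
        ht->nbatch *= 2;
        return 1;
    }
    return 0;
}
```

Note that the toy does not move any tuples out of memory, so spaceUsed does
not shrink after the split.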

Currently, if the data is distributed unevenly, a batch split may fail (all
the tuples fall into one of the split batches and the other one is empty).
Postgres then sets hashtable->growEnable to false and never increases the
number of batches again.
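
A minimal sketch of how such a failed split could be detected, assuming the
batch number is taken from the hash bits just above the bucket bits. The
function names and the exact bit layout are assumptions for illustration
only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * After nbatch has been doubled to new_nbatch, a tuple stays in memory
 * only if its new batch number still equals curbatch.  Returns the number
 * of tuples that leave memory ("freed").
 */
size_t
toy_count_freed(const uint32_t *hashes, size_t ntuples,
                int log2_nbuckets, int new_nbatch, int curbatch)
{
    size_t nfreed = 0;

    /* new_nbatch must be a power of two for the mask below to work. */
    assert(new_nbatch > 0 && (new_nbatch & (new_nbatch - 1)) == 0);

    for (size_t i = 0; i < ntuples; i++)
    {
        int batchno = (int) ((hashes[i] >> log2_nbuckets) &
                             (uint32_t) (new_nbatch - 1));

        if (batchno != curbatch)
            nfreed++;
    }
    return nfreed;
}

/* The split achieved nothing if it moved none of the tuples, or all of them. */
int
toy_split_failed(size_t nfreed, size_t ninmemory)
{
    return nfreed == 0 || nfreed == ninmemory;
}
```

With identical hash values every tuple lands in the same new batch, so the
split is reported as failed regardless of how many times nbatch is doubled.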

If the tuples become more diverse later, splitting the batch is still
valuable: it keeps the current batch from growing too big and eventually
running out of memory.

To fix this, we introduce a penalty on hashtable->spaceAllowed, which is
the threshold that determines whether to increase the number of batches.
If a batch split fails, we increase the penalty instead of just turning off
the growEnable flag.
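
One way to picture the idea, as a sketch only: the penalty acts as a
multiplier on spaceAllowed and doubles after each failed split. The
multiplier form, the doubling policy, and all names below (ToyGrowState,
growPenalty as a field, the toy_* functions) are assumptions made for
illustration and may differ from what the attached patch does.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyGrowState
{
    size_t spaceAllowed;  /* base memory limit for the operator */
    size_t growPenalty;   /* hypothetical multiplier, starts at 1 */
} ToyGrowState;

/* Memory level that must be exceeded before another split is attempted. */
size_t
toy_split_threshold(const ToyGrowState *st)
{
    assert(st->growPenalty >= 1);
    return st->spaceAllowed * st->growPenalty;
}

/* Returns 1 if a split should be attempted at this memory usage. */
int
toy_should_split(const ToyGrowState *st, size_t spaceUsed)
{
    return spaceUsed > toy_split_threshold(st);
}

/*
 * Record the outcome of a split.  Assumed policy: a failed split doubles
 * the penalty; a successful one leaves it unchanged.
 */
void
toy_record_split(ToyGrowState *st, int split_failed)
{
    if (split_failed)
        st->growPenalty *= 2;
}
```

Unlike a boolean flag, the threshold only moves further away after each
failure, so a later split remains possible once memory usage grows enough.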

Any comments?

--
Thanks

Hubert Zhang

Attachment: 0001-Using-growPenalty-to-replace-growEnable-in-hashtable.patch (application/octet-stream, 4.1 KB)
