Re: WIP: [[Parallel] Shared] Hash

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: [[Parallel] Shared] Hash
Date: 2017-02-08 13:03:57
Message-ID: CAFjFpRfCgnx--mCvBewqq05ZkC4f56YdvYMoXFzOFxjh2kU1pg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> 0004-hj-refactor-batch-increases-v4.patch:
>
> Modify the existing hash join code to detect work_mem exhaustion at
> the point where chunks are allocated, instead of checking after every
> tuple insertion. This matches the logic used for estimating, and more
> importantly allows for some parallelism in later patches.

The patch has three changes
1. change dense_alloc() to accept respect_workmem argument and use it
within the function.
2. Move call to ExecHashIncreaseNumBatches() into dense_alloc() from
ExecHashTableInsert() to account for memory before inserting new tuple
3. Check growEnabled before calling ExecHashIncreaseNumBatches().

I think checking growEnabled within ExecHashIncreaseNumBatches() is
more easy to maintain that checking at every caller. If someone is to
add a caller tomorrow, s/he has to remember to add the check.

It might be better to add some comments in
ExecHashRemoveNextSkewBucket() explaining why dense_alloc() should be
called with respect_work_mem = false? ExecHashSkewTableInsert() does
call ExecHashIncreaseNumBatches() after calling
ExecHashRemoveNextSkewBucket() multiple times, so it looks like we do
expect increase in space used and thus go beyond work_mem for a short
while. Is there a way we can handle this case in dense_alloc()?

Is it possible that increasing the number of batches changes the
bucket number of the tuple being inserted? If so, should we
recalculate the bucket and batch of the tuple being inserted?

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-02-08 13:25:34 Re: Parallel bitmap heap scan
Previous Message Pavel Raiskup 2017-02-08 12:31:01 Re: [PATCH] configure-time knob to set default ssl ciphers