Re: Treating work_mem as a shared resource (Was: Parallel Hash take II)

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Oleg Golovanov <rentech(at)mail(dot)ru>
Subject: Re: Treating work_mem as a shared resource (Was: Parallel Hash take II)
Date: 2017-11-16 04:48:01
Message-ID: CAKJS1f_ny8G_CM6kNiSEPWfBvv6sRBr9+NNP=oGfw1ids0m9TQ@mail.gmail.com
Lists: pgsql-hackers

On 16 November 2017 at 16:38, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> * To understand how this relates to admission control. The only
> obvious difference that I can think of is that admission control
> probably involves queuing when very memory constrained, and isn't
> limited to caring about memory. I'm not trying to make swapping/OOM
> impossible here; I'm trying to make it easier to be a Postgres DBA
> sizing work_mem, and make it so that DBAs don't have to be stingy with
> work_mem. The work_mem sizing formulas we sometimes promote (based on
> max_connections) are probably very limiting in the real world.

I had always imagined that this should be some sort of work_mem_pool.
Each plan would carry an estimate of how much memory it expects to
consume, which I'd thought would be N * work_mem, where N is the
number of nodes in the plan that require work_mem. At the start of
execution, we'd atomically add that amount to a variable in shared
memory that tracks work_mem_pool usage. If the resulting usage is <=
work_mem_pool, we start execution; if not, we add ourselves to some
wait queue and go to sleep, to be signalled when another plan's
execution completes and releases memory back into the pool. On wakeup
we'd re-check, and just go back to sleep if there's still not enough
space.
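
To make that concrete, here's a rough sketch in C of the sort of
thing I have in mind. All of the work_mem_pool_* names, the
work_mem_pool_kb GUC and the struct are made up; only the LWLock and
ConditionVariable primitives are existing core infrastructure. Unlike
the increment-then-check wording above, the sketch does the check and
the increment together under the lock, which saves having to undo a
failed reservation:

#include "postgres.h"

#include "pgstat.h"
#include "storage/condition_variable.h"
#include "storage/lwlock.h"

/* Hypothetical shared-memory state, set up at shmem-init time. */
typedef struct WorkMemPoolShared
{
	LWLock		lock;			/* protects 'reserved' */
	int64		reserved;		/* kB reserved by running plans */
	ConditionVariable cv;		/* waiters sleep here */
} WorkMemPoolShared;

static WorkMemPoolShared *pool;	/* points into shared memory */
static int64 work_mem_pool_kb;	/* hypothetical GUC; -1 = disabled */

/*
 * Reserve request_kb (the plan's N * work_mem estimate) before
 * execution starts, sleeping until it fits into the pool.
 */
static void
work_mem_pool_acquire(int64 request_kb)
{
	for (;;)
	{
		bool		ok = false;

		LWLockAcquire(&pool->lock, LW_EXCLUSIVE);
		if (pool->reserved + request_kb <= work_mem_pool_kb)
		{
			pool->reserved += request_kb;
			ok = true;
		}
		LWLockRelease(&pool->lock);

		if (ok)
			break;

		/* Wait to be woken when some other plan releases memory. */
		ConditionVariableSleep(&pool->cv, PG_WAIT_EXTENSION);
	}
	ConditionVariableCancelSleep();
}

/*
 * Return the reservation when execution completes, then wake all
 * waiters so they can re-check whether they now fit.
 */
static void
work_mem_pool_release(int64 request_kb)
{
	if (work_mem_pool_kb == -1 || request_kb == 0)
		return;					/* nothing was reserved */

	LWLockAcquire(&pool->lock, LW_EXCLUSIVE);
	pool->reserved -= request_kb;
	LWLockRelease(&pool->lock);
	ConditionVariableBroadcast(&pool->cv);
}

(PG_WAIT_EXTENSION is just a placeholder; a real patch would add its
own wait event.)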

Simple plans with no work_mem requirement could probably skip all of
these checks, which may well help keep concurrency up. I'm just not
all that clear on how to handle the case where a plan's memory
estimate exceeds work_mem_pool: such a plan would never get to run.
Perhaps in that case everything that requires any memory must wait so
that the query can run alone, i.e. special-case it to require
work_mem_pool usage to be 0 before it runs. Or maybe it should just
be an ERROR?
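
To sketch both options, building on the code above (the bool GUC
controlling the behaviour is imaginary):

/*
 * Hypothetical handling for a plan whose estimate exceeds the whole
 * pool: either reject it outright, or wait for the pool to drain to
 * zero and let it run alone.  In the latter case the reservation
 * deliberately overshoots the pool, blocking everything else that
 * needs memory until this query finishes.
 */
static void
work_mem_pool_acquire_oversized(int64 request_kb)
{
	if (work_mem_pool_error_on_oversize)	/* imaginary GUC */
		ereport(ERROR,
				(errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
				 errmsg("plan requires " INT64_FORMAT " kB but work_mem_pool is " INT64_FORMAT " kB",
						request_kb, work_mem_pool_kb)));

	for (;;)
	{
		bool		ok = false;

		LWLockAcquire(&pool->lock, LW_EXCLUSIVE);
		if (pool->reserved == 0)
		{
			pool->reserved += request_kb;	/* overshoots the pool */
			ok = true;
		}
		LWLockRelease(&pool->lock);

		if (ok)
			break;
		ConditionVariableSleep(&pool->cv, PG_WAIT_EXTENSION);
	}
	ConditionVariableCancelSleep();
}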

Probably the whole feature could be disabled by setting work_mem_pool
to -1, which might be a better option for users who find there's some
kind of contention around the memory pool checks.
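
Combining that with the skip-the-checks idea above, the per-query
entry point might look something like this (again, all names made
up):

/*
 * Hypothetical entry point, called once before execution begins.
 * Skips all pool accounting when the feature is disabled or the plan
 * needs no work_mem, so simple queries pay next to nothing.
 */
static void
work_mem_pool_enter(int64 request_kb)
{
	if (work_mem_pool_kb == -1 || request_kb == 0)
		return;

	if (request_kb > work_mem_pool_kb)
		work_mem_pool_acquire_oversized(request_kb);
	else
		work_mem_pool_acquire(request_kb);
}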

> I freely admit that my proposal is pretty hand-wavy at this point,
> but, as I said, I want to at least get the ball rolling.

Me too. I might have overlooked some giant roadblock.

I think it's important that the work_mem_pool reservation is consumed
at the start of the query rather than when the first memory-consuming
node runs, as there'd likely be some deadlock-style waiting problem
if plans that are partway through execution could start waiting for
other plans to complete. That's not ideal, as we'd be assuming that a
plan always consumes all of its work_mems at once, but it seems
better than what we have today. Maybe we can improve on it later.
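
For what it's worth, the natural place for that seems to be around
executor startup/shutdown; roughly (the workmem_reserved field and
the estimate function are made up):

/* in something like standard_ExecutorStart(), before any node runs: */
queryDesc->workmem_reserved =
	estimate_plan_work_mem(queryDesc->plannedstmt);	/* made-up helper */
work_mem_pool_enter(queryDesc->workmem_reserved);

/* in something like standard_ExecutorEnd(): */
work_mem_pool_release(queryDesc->workmem_reserved);

That way a plan never sleeps on the pool while already holding part
of a reservation.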

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
