Re: Limit allocated memory per session

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Euler Taveira de Oliveira <euler(at)timbira(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, daveg <daveg(at)sonic(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Limit allocated memory per session
Date: 2009-10-02 01:18:20
Message-ID: 603c8f070910011818w68721563nb46b2994c94ecaec@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 1, 2009 at 12:15 PM, Euler Taveira de Oliveira
<euler(at)timbira(dot)com> wrote:
> Robert Haas escreveu:
>> On Thu, Oct 1, 2009 at 11:47 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Euler Taveira de Oliveira <euler(at)timbira(dot)com> writes:
>>>> Tom Lane escreveu:
>>>>> daveg <daveg(at)sonic(dot)net> writes:
>>>>>> I'd like to propose adding a new GUC to limit the amount of memory a backend
>>>>>> can allocate for its own use.
>>>>> Use ulimit.
>>>>>
>>>> What about platforms (Windows) that don't have ulimit?
>>> Get a real operating system ;-)
>>>
>>> Seriously, the proposed patch introduces overhead into a place that is
>>> already a known hot spot, in return for not much of anything.  It will
>>> *not* bound backend memory use very accurately, because there is no way
>>> to track raw malloc() calls.  And I think that 99% of users will not
>>> find it useful.
>>
>> What WOULD be useful is to find a way to configure work_mem per
>> backend rather than per executor node.  But that's a much harder
>> problem.
>>
> I see. The tough problem is: how do we get per-backend memory usage
> accurately? Is relying on an OS-specific API the only way?

As I see it, this is really a planning problem, not an executor
problem, so measuring ACTUAL memory usage isn't what matters: the
problem is taking memory usage into account during planning. The
difficulty with adjusting work_mem right now is that the correct value
depends not only on the number of queries that are concurrently
executing (which isn't a constant) but also on the number of sort/hash
operations being performed per query (which is also not a constant).
So if your queries become more complex, a value of work_mem that was
previously OK may start to cause swapping, which encourages setting
work_mem conservatively. But setting it conservatively can cause the
planner to pick plans that save memory at a LARGE performance cost.
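
To put rough numbers on that worst case (everything below is made up
purely for illustration - it's just the multiplication, not anything
from the backend):

/* Back-of-the-envelope worst case for a per-node work_mem limit.
 * All of the figures are invented for illustration only. */
#include <stdio.h>

int main(void)
{
    int  backends = 100;            /* concurrent queries - not a constant */
    int  nodes_per_query = 4;       /* sort/hash nodes per query - not constant either */
    long work_mem_kb = 64 * 1024;   /* work_mem = 64MB */

    long worst_case_mb = (long) backends * nodes_per_query * work_mem_kb / 1024;
    printf("worst case: %ld MB\n", worst_case_mb);   /* 25600 MB */
    return 0;
}

With numbers like that it's easy to see why people end up setting
work_mem far lower than any individual sort would like.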

Fixing this isn't simple. Right now, when planning a particular
joinrel, we only keep track of the best plans for each possible set of
path keys, regardless of how much or how little memory they use. So if
we do something naive, like just track the total amount of memory that
each candidate path is forecast to use and avoid letting it go above
some ceiling, query planning might fail altogether, because the
lower-level joinrels use as much memory as they want and the
higher-level nodes, which can't be built without some memory of their
own, can't be planned at all. Or we might just end up with a badly
suboptimal plan, because lower down in the tree we pick a slightly
cheaper plan that uses a LOT more memory over a slightly more
expensive one that uses much less. Later we'll wish we hadn't, but by
that point it's too late.
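
Here's a toy illustration of that trap - this is not planner code, and
the costs, memory numbers, and the budget are all invented:

/* Toy illustration of the "naive ceiling" trap.  Not planner code;
 * every number here is invented. */
#include <stdio.h>

typedef struct { const char *name; double cost; long mem_mb; } ToyPath;

int main(void)
{
    long budget_mb = 600;       /* hypothetical query-wide memory ceiling */

    /* Two candidates for the lower joinrel.  Today we keep only the
     * cheaper one for a given set of path keys; memory isn't a factor. */
    ToyPath hash  = { "hash join",  100.0, 512 };
    ToyPath merge = { "merge join", 110.0,  64 };
    ToyPath kept  = (hash.cost <= merge.cost) ? hash : merge;

    /* The upper node unavoidably needs memory of its own. */
    long upper_mem_mb = 200;

    if (kept.mem_mb + upper_mem_mb > budget_mb)
        printf("kept the %s; now %ld MB > %ld MB budget - stuck\n",
               kept.name, kept.mem_mb + upper_mem_mb, budget_mb);
    else
        printf("kept the %s; it fits\n", kept.name);

    /* The discarded merge join (64 + 200 = 264 MB) would have fit,
     * but it's long gone by the time we find that out. */
    return 0;
}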

Another possible angle of attack is to try to give the planner a range
for work_mem rather than a hard limit. The planner would ordinarily
construct paths as though the lower end of the range was the limit,
but for a sufficiently large cost savings it would be willing to adopt
a path that used more memory. Potentially this willingness could also
be conditioned on the amount of memory used by the path so far,
although that has the same problems described above in kind if not in
degree. I'm not really sure whether something like this can be made
to work; I'm not sure there's really enough information available when
constructing paths for any sort of local decision-making to prove
fruitful.
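
To show the sort of rule I have in mind, a comparison function along
these lines might be the starting point - a rough sketch only; the
range endpoints, the 20% threshold, and the function itself are all
hypothetical:

#include <stdbool.h>
#include <stdio.h>

typedef struct { double cost; long mem_kb; } SketchPath;

/* Hypothetical knobs: the work_mem range and how much of a cost
 * saving justifies going above the low end of it. */
static const long   work_mem_low_kb  = 16 * 1024;
static const long   work_mem_high_kb = 256 * 1024;
static const double required_saving  = 0.20;    /* say, 20% cheaper */

static bool
prefer_bigger_memory_path(SketchPath frugal, SketchPath hungry)
{
    if (hungry.mem_kb > work_mem_high_kb)
        return false;                        /* outside the range entirely */
    if (hungry.mem_kb <= work_mem_low_kb)
        return hungry.cost < frugal.cost;    /* low end: cost alone decides */
    /* above the low end: demand a sufficiently large cost saving */
    return hungry.cost < frugal.cost * (1.0 - required_saving);
}

int main(void)
{
    SketchPath frugal = { 1000.0,  16 * 1024 };  /* sorts within the low end */
    SketchPath hungry = {  700.0, 128 * 1024 };  /* in-memory sort, 30% cheaper */

    printf("%s\n", prefer_bigger_memory_path(frugal, hungry)
                   ? "take the bigger-memory path"
                   : "stick with the frugal path");
    return 0;
}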

The other idea I have is to adopt a strategy where each plan node has
upper and lower bounds on cost, as I previously suggested here with
respect to index-only scans.

http://archives.postgresql.org/pgsql-hackers/2009-09/msg01379.php

The idea would basically be to estimate the lower-bound cost of a
sort on the assumption that we'll have the maximum possible amount of
memory to work with (say, the budget for the whole query), and the
upper-bound cost on the assumption that we'll have the minimum
possible amount (zero, or whatever the minimal amount is). We can
also estimate the most memory we think we can usefully use (for
example, a hash join with a small inner rel doesn't benefit from more
memory than the amount required to hold the entire hash table in
memory).
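
The per-node bookkeeping would be something along these lines - just a
sketch of the shape of it; the field names and every number are
invented:

/* Sketch of bracketing each memory-hungry node's cost: the cost if it
 * gets all the memory it can use, the cost if it gets (nearly) none,
 * and the most memory it can usefully consume.  All invented. */
#include <stdio.h>

typedef struct
{
    const char *label;
    double cost_with_max_mem;   /* e.g. hash table fits, in-memory sort */
    double cost_with_min_mem;   /* e.g. many batches, external sort */
    long   useful_mem_kb;       /* beyond this, more memory buys nothing */
} NodeMemProfile;

int main(void)
{
    NodeMemProfile nodes[] = {
        /* hypothetical hash join: inner hash table is ~300MB, so that's
         * the most memory it can use; with less it has to batch */
        { "hash join", 500.0, 900.0, 300 * 1024 },
        /* hypothetical sort: cheap in memory, pricier spilling to disk */
        { "sort",      200.0, 350.0, 150 * 1024 },
    };

    for (int i = 0; i < 2; i++)
        printf("%-10s cost in [%.0f, %.0f], useful memory up to %ld MB\n",
               nodes[i].label, nodes[i].cost_with_max_mem,
               nodes[i].cost_with_min_mem, nodes[i].useful_mem_kb / 1024);
    return 0;
}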

After we complete the first round of planning, we look at the
resulting paths and decide which sorts or hashes will get funded with
how much memory. I'm hand-waving a little bit here, because there
may be a knapsack problem lurking (which is NP-complete), since the
cost as a function of memory probably has sharp cliffs with not much
change in between them - certainly for hashing, and I suspect for
sorting as well - but it might be that in practice N is small enough
not to matter, or we might be able to find an approximation that is
good enough that we can live with it. Even if we can get past that
hurdle, though, there are still all the caveats from the original
email, principally that it's unclear whether the necessary
computations can be done without blowing planning time out of the
water. Plus, if we used this strategy for multiple purposes, like
placement of heap fetch nodes and also allocation of work memory,
there could be interdependencies that would turn the whole thing into
a giant mess.
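
Just to make the hand-waving slightly more concrete, one simple
funding pass would be a greedy one - best cost saving per KB first -
which is only an approximation of the knapsack problem and assumes
each node is effectively all-or-nothing. Every number and name below
is invented:

/* Greedy sketch of the post-planning "funding" pass: given a
 * query-wide memory budget and bracketed costs per node, decide which
 * sorts/hashes get their full useful memory.  All numbers invented. */
#include <stdio.h>
#include <stdlib.h>

typedef struct
{
    const char *label;
    double cost_with_max_mem;
    double cost_with_min_mem;
    long   useful_mem_kb;
    int    funded;
} NodeMemProfile;

static int
by_saving_per_kb(const void *a, const void *b)
{
    const NodeMemProfile *x = a, *y = b;
    double rx = (x->cost_with_min_mem - x->cost_with_max_mem) / x->useful_mem_kb;
    double ry = (y->cost_with_min_mem - y->cost_with_max_mem) / y->useful_mem_kb;
    return (rx < ry) - (rx > ry);   /* descending by saving per KB */
}

int main(void)
{
    NodeMemProfile nodes[] = {
        { "hash join A", 500.0,  900.0, 300 * 1024, 0 },
        { "sort B",      200.0,  350.0, 150 * 1024, 0 },
        { "hash join C", 800.0, 2000.0, 600 * 1024, 0 },
    };
    int  n = sizeof(nodes) / sizeof(nodes[0]);
    long budget_kb = 512 * 1024;    /* hypothetical query budget: 512MB */

    qsort(nodes, n, sizeof(nodes[0]), by_saving_per_kb);
    for (int i = 0; i < n; i++)
    {
        if (nodes[i].useful_mem_kb <= budget_kb)
        {
            nodes[i].funded = 1;
            budget_kb -= nodes[i].useful_mem_kb;
        }
    }

    for (int i = 0; i < n; i++)
        printf("%-12s %s\n", nodes[i].label,
               nodes[i].funded ? "gets its full useful memory"
                               : "runs at minimal memory");
    return 0;
}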

So to reiterate my first comment: a MUCH harder problem.

...Robert
