Automatically setting work_mem

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Automatically setting work_mem
Date: 2006-03-16 15:34:56
Message-ID: 1142523296.3859.412.camel@localhost.localdomain
Lists: pgsql-hackers pgsql-patches

One of the key points influencing the performance of certain operations is
the amount of memory available to them. Sorts, Materialize, Hash Joins,
Aggs and hashed subquery plans all want lots of memory.

Static allocation of memory is good in some situations, but not in
others. In many cases, sysadmins want to be able to tell the server how
much memory it can use and then have the server work out how to allocate
that according to the work already executing. (Whatever we do, the
static allocation of memory via work_mem should remain, so I suggest
only additional options rather than changes to the existing mechanisms.)

My goal is a simple and effective way of increasing performance without
needing to sweat over particular settings for individual backends, all
of which need to be individually reconsidered when we upgrade memory.
Small additional amounts of memory can make huge differences to elapsed
times; we need a flexible way to use more when it makes sense to do so.

I envisage a new setting, shared_work_mem, that would allow a sysadmin
to define a secondary pool of memory from which backends could draw
once they run out of their primary allocation (work_mem).
shared_work_mem=0 would provide the current behaviour.

shared_work_mem would *not* be allocated until time of use; it is an
abstract concept only. As explained below, it would not be shared memory
at all, but privately allocated process memory.
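
To illustrate, postgresql.conf might then carry something like the
following (the parameter name is as proposed above; units and values
are only a guess):

work_mem = 1024               # per-operation limit in KB, as today
shared_work_mem = 262144      # proposed: extra pool backends may draw on;
                              # shared_work_mem = 0 keeps current behaviour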

We would only look at dynamically changing work_mem in those few
restricted cases where we already track memory usage against the
work_mem limit. If we
hit that limit, we would make a request to the central pool: "Can I be
allotted another 2MB please?" (etc). The central allotment mechanism
would then say Yes or No. If allotted the memory, the backend would then
palloc up to that limit. The backend may return later for additional
allotments, but for now it has been allowed to dynamically increase its
memory usage. This allotment would be noted in the memory context
header, so that when the memory context is freed, the allotment can be
"returned" to the central pool by a deallotment call. This is now easier
than before since each sort within a query has its own memory context.

(I use the term "allot" to differentiate it from the term "allocate"
which describes the execution of malloc etc. First the server would
conceptually allot memory, then we would physically allocate it. Once
allocated, memory would not be deallocated any earlier than normal.
Memory would never be deallotted until the memory is deallocated.)
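
As a standalone illustration only (none of these names exist in the
backend, and a real version would live in shared memory behind a
spinlock or LWLock), the central bookkeeping could look something like:

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical central pool, sized from shared_work_mem at startup */
typedef struct WorkMemPool
{
    size_t      total;        /* shared_work_mem, in bytes */
    size_t      allotted;     /* sum of outstanding allotments */
} WorkMemPool;

/* "Can I be allotted another nbytes please?" -- Yes or No */
static bool
pool_allot(WorkMemPool *pool, size_t nbytes)
{
    if (pool->allotted + nbytes > pool->total)
        return false;         /* No: caller stays within work_mem */
    pool->allotted += nbytes; /* Yes: remember it for the deallotment */
    return true;
}

/* the deallotment call, made when the owning memory context is freed */
static void
pool_return(WorkMemPool *pool, size_t nbytes)
{
    pool->allotted -= nbytes;
}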

shared_work_mem would be a SUSET parameter, allowing it to be changed
up/down while the server is running. All of the details around this
would be built into a new API for re/palloc calls aimed at larger
requests.
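
Continuing the sketch above, the consumer side - say, a sort that
tracks its usage against work_mem - might call something like this
instead of spilling immediately (again, every name is hypothetical,
and the allotment would really be noted in the memory context header):

/* Hypothetical per-operation bookkeeping, continuing the sketch above */
typedef struct WorkSpace
{
    size_t      limit;        /* effective work_mem for this operation */
    size_t      allotted;     /* extra bytes obtained from the pool, kept
                               * so they can be returned at context free */
} WorkSpace;

static bool
workspace_try_extend(WorkMemPool *pool, WorkSpace *ws, size_t extra)
{
    if (!pool_allot(pool, extra))
        return false;         /* refused: spill to disk as today */
    ws->allotted += extra;
    ws->limit += extra;       /* backend may now palloc up to this limit */
    return true;
}

/* called from the memory context's reset/delete path */
static void
workspace_release(WorkMemPool *pool, WorkSpace *ws)
{
    pool_return(pool, ws->allotted);
    ws->allotted = 0;
}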

Some thorny points are:
1. what is the allotment policy?
2. what do we do when the memory pool has all been used up?
3. do we make the allocation at planning time, allowing us to
potentially use a different plan because we know we will have the memory
to use when we get to the executor?

My current thoughts are:
(1) allow different allotment policies, possibly even using an external
function call, basically giving everybody the flexibility they want.
For now, we can provide a simple mechanism for starters, then add more
later,
e.g. shared_work_mem_policy = 'firstcomefirstserved(10)'
(see the sketch after this list). I don't want to introduce too many
new parameters here...

(2) Various options here:
a) query queues for allocation
b) query throws ERROR OOM (unlikely to be a popular one)
c) query gets nothing (good alloc scheme can prevent harshness here...)
Simplicity says c), since a) is lots of work and b) is not useful.

(3) let's do this simply to start with - allocation only occurs in the
executor, so it is tied neatly into the executor memory contexts.
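
To make (1) concrete, and reusing the WorkMemPool from the earlier
sketch, a 'firstcomefirstserved(10)' policy might be nothing more than
this (the policy-function interface is entirely made up here):

/*
 * Read 'firstcomefirstserved(10)' as: grant requests in arrival order,
 * all or nothing, but refuse any single request larger than 10% of the
 * pool.  Returns the number of bytes actually allotted.
 */
static size_t
policy_fcfs(WorkMemPool *pool, size_t requested, int max_percent)
{
    size_t      cap = pool->total * (size_t) max_percent / 100;

    if (requested > cap ||
        requested > pool->total - pool->allotted)
        return 0;             /* option c): the query gets nothing */

    pool->allotted += requested;
    return requested;
}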

Another idea is to introduce transactional statement queuing, but that
may be a sledgehammer to crack a fairly simple nut.

There are some additional technical points which need not be discussed
yet, though they would also need to be addressed.

Best Regards, Simon Riggs
