Re: sb_alloc: a new memory allocator for PostgreSQL

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: sb_alloc: a new memory allocator for PostgreSQL
Date: 2014-05-06 14:40:18
Message-ID: CA+U5nMKpT+j6_2aXCz-b2AxJS+6sn4uU17Le=v-f06ZzsTBnEg@mail.gmail.com
Lists: pgsql-hackers

On 6 May 2014 14:49, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, May 6, 2014 at 9:31 AM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
>> As a generic remark, I wish that whatever parallel algorithms we will use
>> won't need a lot of ad hoc memory allocations from shared memory. Even
>> though we have dynamic shared memory now, complex data structures with a lot
>> of pointers and different allocations are more painful to debug, tune, and
>> make concurrency-safe. But I have no idea what exactly you have in mind, so
>> I'll just have to take your word on it that this is sensible.
>
> Yeah, I agree. Actually, I'm hoping that a lot of what we want to do
> can be done using the shm_mq stuff, which uses a messaging paradigm.
> If the queue is full, you wait for the consumer to read some data
> before writing more. That is much simpler and avoids a lot of
> problems.
>
> There are several problems with using pointers in dynamic shared
> memory. The ones I'm most concerned about are:
>
> 1. Segments are relocatable, so you can't actually use absolute
> pointers. Maybe someday we'll have a facility for dynamic shared
> memory segments that are mapped at the same address in every process,
> or maybe not, but right now we sure don't.

Sounds like a problem for static allocations, not dynamic ones.

It makes a lot of sense to use dynamic shared memory for sorts in
particular, since you can just share the base pointer and other info,
and a "blind worker" can then do the sort for you without needing
transactions, snapshots, etc.
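
To make the "share the base pointer" idea concrete, here is a rough
sketch, not PostgreSQL code. It assumes the job descriptor sits at the
start of the segment and refers to its data by byte offset, so it stays
valid wherever each process maps the segment. SortJob, seg_resolve,
job_init and blind_worker_sort are names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical job descriptor stored at the start of a shared segment. */
typedef struct SortJob
{
    size_t      tuples_off;     /* byte offset of the data from segment base */
    size_t      ntuples;        /* number of int32 elements to sort */
} SortJob;

/* Turn a segment-relative offset into a pointer valid in this mapping. */
static void *
seg_resolve(void *base, size_t off)
{
    return (char *) base + off;
}

/* Leader side: lay out the descriptor and the data inside the segment. */
static void
job_init(void *base, size_t seg_size, const int32_t *vals, size_t n)
{
    SortJob    *job = (SortJob *) base;
    size_t      off = sizeof(SortJob);

    assert(off + n * sizeof(int32_t) <= seg_size);
    job->tuples_off = off;
    job->ntuples = n;
    memcpy(seg_resolve(base, off), vals, n * sizeof(int32_t));
}

static int
cmp_int32(const void *a, const void *b)
{
    int32_t     x = *(const int32_t *) a;
    int32_t     y = *(const int32_t *) b;

    return (x > y) - (x < y);
}

/*
 * Worker side: needs only its own base address for the segment.  No
 * absolute pointer is stored anywhere in the segment itself.
 */
static void
blind_worker_sort(void *base)
{
    SortJob    *job = (SortJob *) base;
    int32_t    *tuples = (int32_t *) seg_resolve(base, job->tuples_off);

    qsort(tuples, job->ntuples, sizeof(int32_t), cmp_int32);
}
```

Copying the segment to a different address and calling
blind_worker_sort() on the copy still works, which is the property a
relocatable segment needs.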

I'd also like to consider putting common reference tables as hash
tables into shmem.

> 2. You've got to decide up-front how much memory to set aside for
> dynamic allocation, and you can't easily change your mind later. Some
> of the DSM implementations support growing the segment, but you've got
> to somehow get everyone who is using it to remap it, possibly at a
> different address, so it's a long way from being transparent.

Again, that depends on the algorithm. If we program the sort to work in
fixed-size chunks, we can then use a merge sort at the end to link the
chunks together. So we just use an array of fixed-size chunks. We
might need to dynamically add more chunks, but nobody needs to remap.

Doing it that way means we do *not* need to change anything if it
becomes an external sort. We just mix shmem and external files, all
merged together at the end.
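
A minimal sketch of the chunked approach, under my own assumptions:
each fixed-size chunk is sorted independently and the sorted chunks are
merged at the end. SortChunk, CHUNK_CAPACITY and both functions are
invented for illustration, and the merge uses a simple linear scan over
chunk heads rather than a heap. In a real implementation some of the
sorted runs could live in external files instead of shmem; the merge
step would not care.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_CAPACITY 4        /* deliberately tiny, for illustration only */

/* Hypothetical fixed-size chunk; an array of these never needs remapping. */
typedef struct SortChunk
{
    size_t      nused;                  /* number of valid items */
    int32_t     items[CHUNK_CAPACITY];
} SortChunk;

static int
chunk_cmp_int32(const void *a, const void *b)
{
    int32_t     x = *(const int32_t *) a;
    int32_t     y = *(const int32_t *) b;

    return (x > y) - (x < y);
}

/* Fill chunks from the input and sort each one; returns chunks used. */
static size_t
chunks_load_and_sort(SortChunk *chunks, size_t max_chunks,
                     const int32_t *in, size_t n)
{
    size_t      nchunks = 0;
    size_t      i = 0;

    while (i < n)
    {
        SortChunk  *c;

        assert(nchunks < max_chunks);
        c = &chunks[nchunks++];
        c->nused = 0;
        while (i < n && c->nused < CHUNK_CAPACITY)
            c->items[c->nused++] = in[i++];
        qsort(c->items, c->nused, sizeof(int32_t), chunk_cmp_int32);
    }
    return nchunks;
}

/* Merge the sorted chunks into out; returns the number of items written. */
static size_t
chunks_merge(const SortChunk *chunks, size_t nchunks, int32_t *out)
{
    size_t     *pos = calloc(nchunks ? nchunks : 1, sizeof(size_t));
    size_t      nout = 0;

    assert(pos != NULL);
    for (;;)
    {
        size_t      best = nchunks;     /* sentinel: no candidate yet */
        size_t      c;

        /* Pick the chunk whose current head item is smallest. */
        for (c = 0; c < nchunks; c++)
        {
            if (pos[c] >= chunks[c].nused)
                continue;               /* this chunk is exhausted */
            if (best == nchunks ||
                chunks[c].items[pos[c]] < chunks[best].items[pos[best]])
                best = c;
        }
        if (best == nchunks)
            break;                      /* every chunk is exhausted */
        out[nout++] = chunks[best].items[pos[best]++];
    }
    free(pos);
    return nout;
}
```

Adding more chunks only means using more entries of the chunk array, so
existing chunks stay where they are.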

We need to take account of the amount of memory locally available per
CPU, so there is a maximum size for these things. Not sure what it
should be, though.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
