Re: same-address mappings vs. relative pointers

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: same-address mappings vs. relative pointers
Date: 2013-12-05 14:44:34
Message-ID: 20131205144434.GG12398@alap2.anarazel.de
Lists: pgsql-hackers

On 2013-12-05 07:44:27 -0500, Robert Haas wrote:
> On Thu, Dec 5, 2013 at 4:56 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > Hi Robert,
> >
> > On 2013-12-04 23:32:27 -0500, Robert Haas wrote:
> >> But I'm also learning painfully that this kind of thing only goes so
> >> far. For example, I spent some time looking at what it would take to
> >> provide a dynamic shared memory equivalent of palloc/pfree, a facility
> >> that I feel fairly sure would attract a few fans.
> >
> > I have to admit, I don't fully see the point in a real palloc()
> > equivalent for shared memory. If you're allocating shared memory in a
> > granular fashion at somewhat high frequency you're doing something
> > *wrong*. And the result is not going to be scalable.

> Why? Lots of people have written lots of programs that do just that.

Well, but we're a database, not a generic programming library ;)

> I agree it's overdone, but saying it should never happen seems like an
> over-rotation in the opposite direction.

That might be true.

> But now let's suppose the input data was estimated to fit in work_mem
> but in the end did not, and therefore we need to instead do an
> external sort. That means we're going to start evicting tuples from
> memory and to make room for new tuples we read from the input stream.
> Well, now our strategy of tight-packing everything does not look so
> good, because we have no way of tracking which tuples we no longer
> need and reusing that space for new tuples. If external sort is not
> something we know how to do in parallel, we can potentially copy all
> of the tuples out of the dynamic shared memory segment and back to
> backend-private memory in palloc'd chunks, and then deallocate the
> dynamic shared memory segment... but that's potentially expensive if
> work_mem is large, and it temporarily uses double what we're allowed
> by work_mem.

But what's your alternative if you have a shared_palloc()-like thingy?
The overhead is going to be crazy if you allocate granularly enough to
grow easily, and if you allocate at a coarse level it isn't going to
buy you much.

Essentially, I am not arguing against a facility to dynamically allocate
memory from shared memory, but I think the tradeoffs are different
enough that palloc()/mcxt.c isn't necessarily the thing to model it
after. But then, I don't really have an idea of what it should look
like, not having thought about it much.

> And suppose we want to do a parallel external sort with, say, the
> worker backend pushing tuples into a heap stored in the dynamic shared
> memory segment and other backends writing them out to tapes and
> removing them from the heap. You can't do that without some kind of
> space management, i.e. an allocator.

Hm. I'd have thought one would implement that by pushing fixed-size
batches of tuples to individual workers, letting them sort each chunk in
memory and then writing the result out to disk.
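
Very rough sketch of what I mean, with made-up helper names
(fetch_tuple_batch(), write_sorted_run() and friends don't exist
anywhere), just to illustrate that each worker only ever deals with a
fixed-size chunk:

while (fetch_tuple_batch(input, batch, batch_size) > 0)
{
    /* sort this chunk entirely in worker-local memory */
    qsort(batch->tuples, batch->ntuples, sizeof(SortTuple), cmp_sorttuple);

    /* emit one sorted run; the runs get merged afterwards */
    write_sorted_run(worker_tapes, batch);
}

No shared allocator needed for that, just a way of handing out batches.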

> >> Any thoughts on what the least painful compromise is here?
> >
> > I think a reasonable route is having some kind of smart pointer at
> > the C level that abstracts away the offset math and allows us to use
> > pointers locally. Something like void *sptr_deref(sptr *); where the
> > returned pointer can be used as long as it is purely in memory. And
> > sptr_update(sptr *, void *); which allows an sptr to point elsewhere
> > in the same segment.
> > + lots of smarts
>
> Can you elaborate on this a little bit? I'm not sure I understand
> what you're getting at here.

So, my thought is that we really want something that acts like a
pointer, just in shared memory, so we don't have to do too much offset
math. And I think for performance and readability we want to use
pointers when operating locally in a backend.

So, I guess there are situations in which we essentially want pointers:
a) pointing somewhere in the same segment.
b) pointing at an offset in *another* segment.

So, for a), what if we, instead of storing plain offsets, have something like:
typedef struct sptr
{
    Offset sptr_self;       /* offset of this sptr within its segment */
    Offset sptr_offset;     /* offset of the target within the segment */
} sptr;

and always store that in dynamically allocated shared memory instead of
raw offsets. That allows us to define a function like:
void *
sptr_deref(sptr *sp)
{
    /* ((char *) sp) - sp->sptr_self is the segment's base address */
    return ((char *) sp) - sp->sptr_self + sp->sptr_offset;
}
to get what we point to, without having to reference the segment base,
at the price of having to store two offsets.
To update the pointer we could have:
void
sptr_set(sptr *sp, void *p)
{
    char *segment_base;

    segment_base = ((char *) sp) - sp->sptr_self;

    /* make sure pointer doesn't point below the beginning of the segment */
    Assert(segment_base <= (char *) p);

    /* make sure pointer doesn't point beyond the segment */
    Assert((char *) p < segment_base + GetSegmentSize(segment_base));

    sp->sptr_offset = ((char *) p) - segment_base;
}
To initialize we'd have:
void
sptr_init(dsm_segment *segment, sptr *sp, void *p)
{
    sp->sptr_self = ((char *) sp) - (char *) segment->mapped_address;
    sptr_set(sp, p);
}
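
To illustrate how that would be used (shm_alloc() is a made-up
allocator and Node is just an example struct, nothing existing):

typedef struct Node
{
    int  value;
    sptr next;              /* same-segment "pointer" to another Node */
} Node;

Node *a = shm_alloc(seg, sizeof(Node));
Node *b = shm_alloc(seg, sizeof(Node));

sptr_init(seg, &a->next, b);        /* a->next now "points" at b */

Node *n = sptr_deref(&a->next);     /* a plain pointer, usable locally */

So backend-local code mostly works with real pointers; only the stored
representation is offset based.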

For b) we could have:
typedef struct sptr_far
{
    dsm_handle far_handle;
    sptr       far_ptr;
} sptr_far;
and equivalent sptr_far_deref/set/init, although that would probably
require a more efficient way of resolving a dsm_handle to its
dsm_segment.
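
The far deref would be something like this, assuming we grow a cheap
way of getting from a handle back to the mapped segment
(lookup_segment_by_handle() is made up):

void *
sptr_far_deref(sptr_far *fp)
{
    dsm_segment *seg = lookup_segment_by_handle(fp->far_handle);

    /* the embedded sptr_self is meaningless here, only the offset counts */
    return ((char *) seg->mapped_address) + fp->far_ptr.sptr_offset;
}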

Makes at least some sense?

> > I continue to believe that a) using pointers in dynamically allocated
> > segments is going to end up with lots of pain. b) the pain from not
> > having real pointers is manageable.
>
> Fair opinion, but I think we will certainly need to pass around memory
> offsets in some form for certain things.

Absolutely - I am not sure where the "but" is coming from ;)

> And then I thought, boy, it sucks
> not to be able to declare what kind of a thing we're pointing *at*
> here, but apart from using C++ I see no solution to that problem. I
> guess we could do something like this:
>
> #define relptr(type) Size
>
> So the compiler wouldn't enforce anything, but at least notationally
> we'd know what sort of object we were supposedly referencing.

There might be some ugly compiler-dependent magic we could do, depending
on how we decide to declare offsets. Like (very, very roughly):

#define relptr(type, struct_name, varname) union struct_name##_##varname { \
    type   relptr_type; \
    Offset relptr_off; \
}

And then, for accessing, have:
#define relptr_access(seg, off) \
    ((typeof((off).relptr_type) *) (((char *) (seg)->base_address) + (off).relptr_off))

But boy, that's ugly.
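
Usage would then look roughly like this (SortItem, Bucket, bucket and
seg are all made up, seg being whatever carries base_address):

typedef struct SortItem { int key; } SortItem;

struct Bucket
{
    relptr(SortItem, Bucket, first) first;  /* only the offset is ever stored */
};

SortItem *item = relptr_access(seg, bucket->first);

The relptr_type member never gets assigned to; it only exists so
typeof() can recover the pointed-to type.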

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
