RE: Copy data to DSA area

From: "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com>
To: 'Thomas Munro' <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Copy data to DSA area
Date: 2018-11-09 01:19:25
Message-ID: 4E72940DA2BF16479384A86D54D0988A6F1EFEC3@G01JPEXMBKW04
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, thank you for all the comment.
It's really helpful.

>From: Thomas Munro [mailto:thomas(dot)munro(at)enterprisedb(dot)com]
>Sent: Wednesday, November 7, 2018 1:35 PM
>
>On Wed, Nov 7, 2018 at 3:34 PM Ideriha, Takeshi <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com>
>wrote:
>> Related to my development (putting relcache and catcache onto shared
>> memory)[1],
>>
>> I have some questions about how to copy variables into shared memory, especially
>DSA area, and its implementation way.
>>
>> Under the current architecture when initializing some data, we
>> sometimes copy certain data using some specified functions
>>
>> such as CreateTupleDescCopyConstr(), datumCopy(), pstrdup() and so on.
>> These copy functions calls palloc() and allocates the
>>
>> copied data into current memory context.
>
>Yeah, I faced this problem in typcache.c, and you can see the function
>share_typledesc() which copies TupleDesc objects into a DSA area.
>This doesn't really address the fundamental problem here though... see below.

I checked share_tupledesc(). My original motivation came from copying tupledesc
with constraint like CreateTupleDescCopyConstr().
But yes, as you stated here and bellow without having to copy
tupledesc constraint, this method makes sense.

>
>> But on the shared memory, palloc() cannot be used. Now I'm trying to
>> use DSA (and dshash) to handle data on the shared memory
>>
>> so for example dsa_allocate() is needed instead of palloc(). I hit upon three ways to
>implementation.
>>
>> A. Copy existing functions and write equivalent DSA version copy
>> functions like CreateDSATupleDescCopyConstr(),
>>
>> datumCopyDSA(), dsa_strdup()
>>
>> In these functions the logic is same as current one but would be replaced palloc()
>with dsa_allocate().
>>
>> But this way looks too straight forward and code becomes less readable and
>maintainable.
>>
>> B. Using current functions and copy data on local memory context temporarily and
>copy it again to DSA area.
>>
>> This method looks better compared to the plan A because we don't need to write
>clone functions with copy-paste.
>>
>> But copying twice would degrade the performance.
>
>It's nice when you can construct objects directly at an address supplied by the caller.
>In other words, if allocation and object initialization are two separate steps, you can
>put the object anywhere you like without copying. That could be on the stack, in an
>array, inside another object, in regular heap memory, in traditional shared memory, in
>a DSM segment or in DSA memory. I asked for an alloc/init separation in the Bloom
>filter code for the same reason. But this still isn't the real problem here...

Yes, actually I tried to create a new function TupleDescCopyConstr() which is almost same
as TupleDescCopy() except also copying constraints. This is supposed to separate allocation
and initialization. But as you pointed out bellow, I had to manage object graph with pointes
and faced the difficulty.

>> C. Enhance the feature of palloc() and MemoryContext.
>>
>> This is a rough idea but, for instance, make a new global flag to tell palloc() to
>use DSA area instead of normal MemoryContext.
>>
>> MemoryContextSwitchToDSA(dsa_area *area) indicates following palloc() to
>allocate memory to DSA.
>>
>> And MemoryContextSwitchBack(dsa_area) indicates to palloc is used as normal
>one.
>>
>> MemoryContextSwitchToDSA(dsa_area);
>>
>> palloc(size);
>>
>> MemoryContextSwitchBack(dsa_area);
>>
>> Plan C seems a handy way for DSA user because people can take advantage of
>existing functions.
>
>The problem with plan C is that palloc() has to return a native pointer, but in order to
>do anything useful with this memory (ie to share it) you also need to get the
>dsa_pointer, but the palloc() interface doesn't allow for that. Any object that holds a
>pointer allocated with DSA-hiding-behind-palloc() will be useless for another process.

Agreed. I didn't have much consideration on this point.

>> What do you think about these ideas?
>
>The real problem is object graphs with pointers. I solved that problem for TupleDesc
>by making them *almost* entirely flat, in commit c6293249. I say 'almost' because it
>won't work for constraints or defaults, but that didn't matter for the typcache.c case
>because it doesn't use those. In other words I danced carefully around the edge of
>the problem.
>
>In theory, object graphs, trees, lists etc could be implemented in a way that allows for
>"flat" storage, if they can be allocated in contiguous memory and refer to sub-objects
>within that space using offsets from the start of the space, and then they could be used
>without having to know whether they are in DSM/DSA memory or regular memory.
>That seems like a huge project though. Otherwise they have to hold dsa_pointer, and
>deal with that in many places. You can see this in the Parallel Hash code. I had to
>make the hash table data structure able to deal with raw pointers OR dsa_pointer.
>That's would be theoretically doable, but really quite painful, for the whole universe of
>PostgreSQL node types and data structures.

Yeah, thank you for summarizing this point. That's one of the difficulty I've faced with
but failed to state it in my previous email.
I agreed your point.
Making everything "flat" is not a realistic choice and holding dsa_pointer here and there
and switching native pointer or dsa_pointer using union type or other things can be done but still huge one.

>I know of 3 ideas that would make your idea C work, so that you could share
>something as complicated as a query plan directly without having to deserialise it to
>use it:
>
>1. Allow the creation of DSA areas inside the traditional fixed memory segment
>(instead of DSM), in a fixed-sized space reserved by the postmaster. That is, use
>dsa.c's ability to allocate and free memory, and possibly free a whole area at once to
>avoid leaking memory in some cases (like MemoryContexts), but in this mode
>dsa_pointer would be directly castable to a raw pointer. Then you could provide a
>regular MemoryContext interface for it, and use it via palloc(), as you said, and all the
>code that knows how to construct lists and trees and plan nodes etc would All Just
>Work. It would be your plan C, and all the pointers would be usable in every process,
>but limited in total size at start-up time.
>
>2. Allow regular DSA in DSM to use raw pointers into DSM segments, by mapping
>segments at the same address in every backend. This involves reserving a large
>virtual address range up front in the postmaster, and then managing the space,
>trapping SEGV to map/unmap segments into parts of that address space as necessary
>(instead of doing that in dsa_get_address()). AFAIK that would work, but it seems to
>be a bit weird to go to such lengths. It would be a kind of home-made simulation of
>threads. On the other hand, that is what we're already doing in dsa.c, except more
>slowly due to extra software address translation from funky pseudo-addresses.
>
>3. Something something threads.

I'm thinking to go with plan 1. No need to think about address translation
seems tempting. Plan 2 (as well as plan 3) looks a big project.

Regards,
Takeshi Ideriha

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Edmund Horner 2018-11-09 01:23:42 Re: Cache relation sizes?
Previous Message Thomas Munro 2018-11-09 01:16:02 Strange corruption in psql output on mereswine