RE: Copy data to DSA area

From: "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com>
To: 'Kyotaro HORIGUCHI' <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "thomas(dot)munro(at)enterprisedb(dot)com" <thomas(dot)munro(at)enterprisedb(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Copy data to DSA area
Date: 2019-04-17 05:07:11
Message-ID: 4E72940DA2BF16479384A86D54D0988A7DB2B7CD@G01JPEXMBKW04
Lists: pgsql-hackers

>From: Ideriha, Takeshi [mailto:ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com]
>Sent: Wednesday, December 5, 2018 2:42 PM
>Subject: RE: Copy data to DSA area

Hi,
It's been a long while since we discussed this topic.
Let me recap first, and then I'll give some thoughts.

It seems we reached consensus on the following:

- Want to palloc/pfree transparently in DSA
- Use Postgres-initialized shared memory as DSA
- Don’t leak memory in shared memory

Things still under discussion:
- How to prevent memory leaks
- How to prevent dangling pointers after cleaning up about-to-leak objects

Regarding memory leaks, I think Robert's idea is promising: allocate objects under a temporary context
while building them, and re-parent them to a permanent context at some point.
While objects are being built they live under a temporary DSA MemoryContext, which is a
child of TopTransactionContext (if we are in a transaction), and they are all freed at once when an error happens.
To delete all the chunks allocated under the temporary DSA context, we need to search for
or remember the locations of all chunks under the context. Unlike AllocSet, we don't have block information
that would let us delete them all together.

So I'm thinking of managing dsa_allocated chunks with a singly linked list to keep track of them and delete them.
The context holds the head of the linked list, and every chunk has a pointer to the next allocated chunk.
But this adds space overhead to every dsa_allocated chunk, which we may want to avoid because shared memory size is limited.
In that case, we can free the pointer area at some point once we are sure that the allocation was successful.
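
To make the linked-list idea concrete, here is a minimal standalone sketch. It is only an illustration under assumptions: malloc/free stand in for dsa_allocate/dsa_free, raw pointers stand in for dsa_pointer, and the names TempContext, ChunkHeader, temp_alloc, temp_reset, temp_commit and chunk_free are hypothetical, not an existing PostgreSQL API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-chunk header: this is the space overhead discussed above. */
typedef struct ChunkHeader
{
    struct ChunkHeader *next;   /* next chunk allocated in the same context */
} ChunkHeader;

/* Hypothetical stand-in for the temporary DSA memory context. */
typedef struct TempContext
{
    ChunkHeader *head;          /* most recently allocated chunk */
    size_t       nchunks;       /* number of chunks currently tracked */
} TempContext;

/*
 * Allocate a chunk and push it onto the context's list.
 * malloc stands in for dsa_allocate. The payload is only pointer-aligned
 * here; a real implementation would pad the header to the maximum alignment.
 */
static void *
temp_alloc(TempContext *ctx, size_t size)
{
    ChunkHeader *hdr;

    assert(ctx != NULL);
    hdr = malloc(sizeof(ChunkHeader) + size);
    if (hdr == NULL)
        return NULL;
    hdr->next = ctx->head;
    ctx->head = hdr;
    ctx->nchunks++;
    return (void *) (hdr + 1);
}

/* Error path: walk the list and free every tracked chunk; returns the count. */
static size_t
temp_reset(TempContext *ctx)
{
    size_t      freed = 0;

    while (ctx->head != NULL)
    {
        ChunkHeader *next = ctx->head->next;

        free(ctx->head);
        ctx->head = next;
        freed++;
    }
    ctx->nchunks = 0;
    return freed;
}

/*
 * Success path: stop tracking the chunks so a later reset leaves them alone.
 * This sketch does not reclaim the header space.
 */
static void
temp_commit(TempContext *ctx)
{
    ctx->head = NULL;
    ctx->nchunks = 0;
}

/* Free a single chunk that is no longer tracked by any temporary context. */
static void
chunk_free(void *payload)
{
    free((ChunkHeader *) payload - 1);
}
```

With this layout, the error path costs one list walk and the success path is O(1), at the price of one pointer per chunk. Actually giving the header space back after a successful allocation would need support from the allocator, which this sketch does not attempt.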

Another question is when we should consider the allocation successful (that is, when we can be sure the object won't leak).
If the allocation is done in a transaction, I think we can consider it safe once the transaction is committed.
Alternatively, I assume this DSA memory context will be used for caches such as relcache, catcache, plancache and so on.
In that case a cache entry won't leak once it's added to a hash table or list, because I assume some eviction mechanism like LRU will be implemented
and it will remove useless entries some time later.

What do you think about these ideas?

Regarding dangling pointers, I think they are also a problem.
After cleaning up objects to prevent a memory leak, we have no mechanism to reset dangling pointers.
On this point I shared some thoughts a while ago, though begin_allocate/end_allocate don't seem to be good names.
More descriptive names might be start_pointing_to_dsa_object_under_construction() and end_pointing_to_dsa_object_under_construction().
https://www.postgresql.org/message-id/4E72940DA2BF16479384A86D54D0988A6F1F259F%40G01JPEXMBKW04
If we can make sure that such dangling pointers never occur, we don't need this.
As Thomas mentioned before, where these interfaces should live needs review, but I couldn't come up with another solution right now.
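
As a rough illustration of what such a start/end pair could do, here is a standalone sketch. It is one possible reading and not the design from the linked message: the registry, its fixed size, and the names PointerRegistry, begin_pointing, end_pointing and reset_dangling are all hypothetical, and raw pointers stand in for dsa_pointer.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TRACKED 8           /* arbitrary limit chosen for this sketch */

/* Hypothetical registry of pointer variables that refer to objects under construction. */
typedef struct PointerRegistry
{
    void      **slots[MAX_TRACKED];     /* addresses of the referring pointers */
    size_t      nslots;
} PointerRegistry;

/* Start tracking *ref. Returns 0 on success, -1 if the registry is full. */
static int
begin_pointing(PointerRegistry *reg, void **ref)
{
    assert(reg != NULL && ref != NULL);
    if (reg->nslots >= MAX_TRACKED)
        return -1;
    reg->slots[reg->nslots++] = ref;
    return 0;
}

/* Construction finished: stop tracking ref so cleanup leaves it alone. */
static void
end_pointing(PointerRegistry *reg, void **ref)
{
    for (size_t i = 0; i < reg->nslots; i++)
    {
        if (reg->slots[i] == ref)
        {
            /* Fill the hole with the last entry; order does not matter. */
            reg->slots[i] = reg->slots[--reg->nslots];
            return;
        }
    }
}

/* Error cleanup: set every still-tracked pointer to NULL; returns the count. */
static size_t
reset_dangling(PointerRegistry *reg)
{
    size_t      n = reg->nslots;

    for (size_t i = 0; i < n; i++)
        *reg->slots[i] = NULL;
    reg->nslots = 0;
    return n;
}
```

The intended pairing would be that the same error path that frees the about-to-leak chunks also calls something like reset_dangling(), so any pointer registered between the start and end calls ends up NULL rather than pointing at freed memory.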

Do you have any thoughts?

best regards,
Ideriha, Takeshi
