Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2020-06-24 10:34:12
Message-ID: CAA4eK1KB8e-nzARJ3bZTHwgiu0w0nG_M7DqypPbGjVz0FNP83w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jun 23, 2020 at 7:00 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> Here is the POC patch to discuss the idea of a cleanup of shared
> fileset on proc exit. As discussed offlist, here I am maintaining
> the list of shared fileset. First time when the list is NULL I am
> registering the cleanup function with on_proc_exit routine. After
> that for subsequent fileset, I am just appending it to filesetlist.
> There is also an interface to unregister the shared file set from the
> cleanup list and that is done by the caller whenever we are deleting
> the shared fileset manually. While explaining it here, I think there
> could be one issue if we delete all the element from the list will
> become NULL and on next SharedFileSetInit we will again register the
> function. Maybe that is not a problem but we can avoid registering
> multiple times by using some flag in the file
>

I don't understand what you mean by "using some flag in the file".

Review comments on various patches.

poc_shared_fileset_cleanup_on_procexit
=================================
1.
- ent->subxact_fileset =
- MemoryContextAlloc(ApplyContext, sizeof(SharedFileSet));
+ MemoryContext oldctx;

+ /* Shared fileset handle must be allocated in the persistent context */
+ oldctx = MemoryContextSwitchTo(ApplyContext);
+ ent->subxact_fileset = palloc(sizeof(SharedFileSet));
SharedFileSetInit(ent->subxact_fileset, NULL);
+ MemoryContextSwitchTo(oldctx);
fd = BufFileCreateShared(ent->subxact_fileset, path);

Why is this change required for this patch and why we only cover
SharedFileSetInit in the Apply context and not BufFileCreateShared?
The comment is also not very clear on this point.

2.
+void
+SharedFileSetUnregister(SharedFileSet *input_fileset)
+{
+ bool found = false;
+ ListCell *l;
+
+ Assert(filesetlist != NULL);
+
+ /* Loop over all the pending shared fileset entry */
+ foreach (l, filesetlist)
+ {
+ SharedFileSet *fileset = (SharedFileSet *) lfirst(l);
+
+ /* remove the entry from the list and delete the underlying files */
+ if (input_fileset->number == fileset->number)
+ {
+ SharedFileSetDeleteAll(fileset);
+ filesetlist = list_delete_cell(filesetlist, l);

Why are we calling SharedFileSetDeleteAll here when in the caller we
have already deleted the fileset as per below code?
BufFileDeleteShared(ent->stream_fileset, path);
+ SharedFileSetUnregister(ent->stream_fileset);

I think it will be good if somehow we can remove the fileset from
filesetlist during BufFileDeleteShared. If that is possible, then we
don't need a separate API for SharedFileSetUnregister.

3.
+static List * filesetlist = NULL;
+
static void SharedFileSetOnDetach(dsm_segment *segment, Datum datum);
+static void SharedFileSetOnProcExit(int status, Datum arg);
static void SharedFileSetPath(char *path, SharedFileSet *fileset, Oid
tablespace);
static void SharedFilePath(char *path, SharedFileSet *fileset, const
char *name);
static Oid ChooseTablespace(const SharedFileSet *fileset, const char *name);
@@ -76,6 +80,13 @@ SharedFileSetInit(SharedFileSet *fileset, dsm_segment *seg)
/* Register our cleanup callback. */
if (seg)
on_dsm_detach(seg, SharedFileSetOnDetach, PointerGetDatum(fileset));
+ else
+ {
+ if (filesetlist == NULL)
+ on_proc_exit(SharedFileSetOnProcExit, 0);

We use NIL for list initialization and comparison. See lock_files usage.

4.
+SharedFileSetOnProcExit(int status, Datum arg)
+{
+ ListCell *l;
+
+ /* Loop over all the pending shared fileset entry */
+ foreach (l, filesetlist)
+ {
+ SharedFileSet *fileset = (SharedFileSet *) lfirst(l);
+ SharedFileSetDeleteAll(fileset);
+ }

We can initialize filesetlist as NIL after the for loop as it will
make the code look clean.

Comments on other patches:
=========================
5.
> 3. On concurrent abort we are truncating all the changes including
> some incomplete changes, so later when we get the complete changes we
> don't have the previous changes, e.g, if we had specinsert in the
> last stream and due to concurrent abort detection if we delete that
> changes later we will get spec_confirm without spec insert. We could
> have simply avoided deleting all the changes, but I think the better
> fix is once we detect the concurrent abort for any transaction, then
> why do we need to collect the changes for that, we can simply avoid
> that. So I have put that fix. (0006)
>

On similar lines, I think we need to skip processing message, see else
part of code in ReorderBufferQueueMessage.

6.
In v29-0002-Issue-individual-invalidations-with-wal_level-lo,
xact_desc_invalidations seems to be a subset of
standby_desc_invalidations, can we have a common code for them?

7.
I think we can avoid sending v29-0007-Track-statistics-for-streaming
this each time. We can do this after the main patch is complete.
Also, we might need to change how and where these stats will be
tracked. See the related discussion [1].

8. In v29-0005-Implement-streaming-mode-in-ReorderBuffer,
* Return oldest transaction in reorderbuffer
@@ -863,6 +909,9 @@ ReorderBufferAssignChild(ReorderBuffer *rb,
TransactionId xid,
/* set the reference to top-level transaction */
subtxn->toptxn = txn;

+ /* set the reference to toplevel transaction */
+ subtxn->toptxn = txn;
+

There is a double initialization of subtxn->toptxn. You need to
remove this line from 0005 patch as we have now added it in an earlier
patch.

9. I think you forgot to update the patch to execute invalidations in
Abort case or I might be missing something. I don't see any changes
in ReorderBufferAbort. You have agreed in one of the emails above [2]
about handling the same.

10. In v29-0008-Add-support-for-streaming-to-built-in-replicatio,
apply_handle_stream_commit(StringInfo s)
{
..
+ /*
+ * send feedback to upstream
+ *
+ * XXX Probably should send a valid LSN. But which one?
+ */
+ send_feedback(InvalidXLogRecPtr, false, false);
..
}

I have given a comment on this code that we don't need this feedback
and you mentioned on June 02 [3] that you will think on it and let me
know your opinion but I don't see a response from you yet. Can you
get back to me regarding this point?

11. Add some comments as to why we have used Shared BufFile interface
instead of Temp BufFile interface?

12. In v29-0013-Change-buffile-interface-required-for-streaming,
+ * Initialize a space for temporary files that can be opened other backends.

/opened other backends/opened for access by other backends

[1] - https://www.postgresql.org/message-id/CA%2Bfd4k5_pPAYRTDrO2PbtTOe0eHQpBvuqmCr8ic39uTNmR49Eg%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAFiTN-t7WZZjFrAjSYj4fu%3DFZ2JKENN8ZHCUZaw-srnrHMWMrg%40mail.gmail.com
[3] - https://www.postgresql.org/message-id/CAFiTN-tHpd%2BzXVemo9WqQUJS50p9m8jD%3DAWjsugKZQ4F-K8Pbw%40mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-06-24 10:40:13 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Previous Message Bharath Rupireddy 2020-06-24 09:50:34 [PATCH] COPY command's data format option allows only lowercase csv, text or binary