Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2020-05-29 09:24:11
Message-ID: CAA4eK1LPS5yrUoa-CJFHA-QYdjs=NEAJZJ+bNjuJmu7Xo0qVig@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, May 27, 2020 at 8:22 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
>
> While reviewing/testing I have found a couple of problems in 0005 and
> 0006 which I have fixed in the attached version.
>
..
>
> In 0006: If we are streaming the serialized changed and there are
> still few incomplete changes, then currently we are not deleting the
> spilled file, but the spill file contains all the changes of the
> transaction because there is no way to partially truncate it. So in
> the next stream, it will try to resend those. I have fixed this by
> sending the spilled transaction as soon as its changes are complete so
> ideally, we can always delete the spilled file. It is also a better
> solution because this transaction is already spilled once and that
> happened because we could not stream it, so we better stream it on
> the first opportunity that will reduce the replay lag which is our
> whole purpose here.
>

I have reviewed these changes (in the patch
v25-0006-Bugfix-handling-of-incomplete-toast-spec-insert-) and below
are my comments.

1.
+ /*
+ * If the transaction is serialized and the the changes are complete in
+ * the top level transaction then immediately stream the transaction.
+ * The reason for not waiting for memory limit to get full is that in
+ * the streaming mode, if the transaction serialized that means we have
+ * already reached the memory limit but that time we could not stream
+ * this due to incomplete tuple so now stream it as soon as the tuple
+ * is complete.
+ */
+ if (rbtxn_is_serialized(txn))
+ ReorderBufferStreamTXN(rb, toptxn);

I think here it is important to explain why it is a must to stream a
prior serialized transaction as otherwise, later we won't be able to
know how to truncate a file.

2.
+ * If complete_truncate is set we completely truncate the transaction,
+ * otherwise we truncate upto last_complete_lsn if the transaction has
+ * incomplete changes. Basically, complete_truncate is passed true only if
+ * concurrent abort is detected while processing the TXN.
*/
static void
-ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn)
+ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
+ bool partial_truncate)
{

The description talks about complete_truncate flag whereas API is
using partial_truncate flag. I think the description needs to be
changed.

3.
+ /* We have truncated upto last complete lsn so stop. */
+ if (partial_truncate && rbtxn_has_incomplete_tuple(toptxn) &&
+ (change->lsn > toptxn->last_complete_lsn))
+ {
+ /*
+ * If this is a top transaction then we can reset the
+ * last_complete_lsn and complete_size, because by now we would
+ * have stream all the changes upto last_complete_lsn.
+ */
+ if (txn->toptxn == NULL)
+ {
+ toptxn->last_complete_lsn = InvalidXLogRecPtr;
+ toptxn->complete_size = 0;
+ }
+ break;
+ }

I think here we can add an Assert to ensure that we don't partially
truncate when the transaction is serialized and add comments for the
same.

4.
+ /*
+ * Subtract the processed changes from the nentries/nentries_mem Refer
+ * detailed comment atop this variable in ReorderBufferTXN structure.
+ * We do this only ff we are truncating the partial changes otherwise
+ * reset these values directly to 0.
+ */
+ if (partial_truncate)
+ {
+ txn->nentries -= txn->nprocessed;
+ txn->nentries_mem -= txn->nprocessed;
+ }
+ else
+ {
+ txn->nentries = 0;
+ txn->nentries_mem = 0;
+ }

I think we can write this comment as "Adjust nentries/nentries_mem
based on the changes processed. See comments where nprocessed is
declared."

5.
+ /*
+ * In streaming mode, sometime we can't stream all the changes due to the
+ * incomplete changes. So we can not directly reset the values of
+ * nentries/nentries_mem to 0 after one stream is sent like we do in
+ * non-streaming mode. So while sending one stream we keep count of the
+ * changes processed in thi stream and only those many changes we decrement
+ * from the nentries/nentries_mem.
+ */
+ uint64 nprocessed;

How about something like: "Number of changes processed. This is used
to keep track of changes that remained to be streamed. As of now,
this can happen either due to toast tuples or speculative insertions
where we need to wait for multiple changes before we can send them."

6.
+ /* Size of the commplete changes. */
+ Size complete_size;

Typo. /commplete/complete

7.
+ /*
+ * Increment the nprocessed count. See the detailed comment
+ * for usage of this in ReorderBufferTXN structure.
+ */
+ change->txn->nprocessed++;

Ideally, this has to be incremented after processing the change. So,
we can combine it with existing check in the patch as below:

if (streaming)
{
change->txn->nprocessed++;

if (rbtxn_has_incomplete_tuple(txn) &&
prev_lsn == txn->last_complete_lsn)
{
/* Only in streaming mode we should get here. */
Assert(streaming);
partial_truncate = true;
break;
}
}

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-05-29 10:22:19 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Previous Message Masahiko Sawada 2020-05-29 09:09:06 Re: SIGSEGV from START_REPLICATION 0/XXXXXXX in XLogSendPhysical () at walsender.c:2762