Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2020-05-29 15:00:42
Message-ID: CAFiTN-vXQx_161WC-a9HvNaF25nwO=JJRpRdTtyfGQHbM3Bd1Q@mail.gmail.com
Lists: pgsql-hackers

On Fri, May 29, 2020 at 2:54 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, May 27, 2020 at 8:22 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> >
> > While reviewing/testing I have found a couple of problems in 0005 and
> > 0006 which I have fixed in the attached version.
> >
> ..
> >
> > In 0006: If we are streaming the serialized changed and there are
> > still few incomplete changes, then currently we are not deleting the
> > spilled file, but the spill file contains all the changes of the
> > transaction because there is no way to partially truncate it. So in
> > the next stream, it will try to resend those. I have fixed this by
> > sending the spilled transaction as soon as its changes are complete so
> > ideally, we can always delete the spilled file. It is also a better
> > solution because this transaction is already spilled once and that
> > happened because we could not stream it, so we better stream it on
> > the first opportunity that will reduce the replay lag which is our
> > whole purpose here.
> >
>
> I have reviewed these changes (in the patch
> v25-0006-Bugfix-handling-of-incomplete-toast-spec-insert-) and below
> are my comments.
>
> 1.
> + /*
> + * If the transaction is serialized and the the changes are complete in
> + * the top level transaction then immediately stream the transaction.
> + * The reason for not waiting for memory limit to get full is that in
> + * the streaming mode, if the transaction serialized that means we have
> + * already reached the memory limit but that time we could not stream
> + * this due to incomplete tuple so now stream it as soon as the tuple
> + * is complete.
> + */
> + if (rbtxn_is_serialized(txn))
> + ReorderBufferStreamTXN(rb, toptxn);
>
> I think here it is important to explain why it is a must to stream a
> prior serialized transaction as otherwise, later we won't be able to
> know how to truncate a file.

Done

> 2.
> + * If complete_truncate is set we completely truncate the transaction,
> + * otherwise we truncate upto last_complete_lsn if the transaction has
> + * incomplete changes. Basically, complete_truncate is passed true only if
> + * concurrent abort is detected while processing the TXN.
> */
> static void
> -ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn)
> +ReorderBufferTruncateTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
> + bool partial_truncate)
> {
>
> The description talks about complete_truncate flag whereas API is
> using partial_truncate flag. I think the description needs to be
> changed.

Fixed

> 3.
> + /* We have truncated upto last complete lsn so stop. */
> + if (partial_truncate && rbtxn_has_incomplete_tuple(toptxn) &&
> + (change->lsn > toptxn->last_complete_lsn))
> + {
> + /*
> + * If this is a top transaction then we can reset the
> + * last_complete_lsn and complete_size, because by now we would
> + * have stream all the changes upto last_complete_lsn.
> + */
> + if (txn->toptxn == NULL)
> + {
> + toptxn->last_complete_lsn = InvalidXLogRecPtr;
> + toptxn->complete_size = 0;
> + }
> + break;
> + }
>
> I think here we can add an Assert to ensure that we don't partially
> truncate when the transaction is serialized and add comments for the
> same.

Done

> 4.
> + /*
> + * Subtract the processed changes from the nentries/nentries_mem Refer
> + * detailed comment atop this variable in ReorderBufferTXN structure.
> + * We do this only ff we are truncating the partial changes otherwise
> + * reset these values directly to 0.
> + */
> + if (partial_truncate)
> + {
> + txn->nentries -= txn->nprocessed;
> + txn->nentries_mem -= txn->nprocessed;
> + }
> + else
> + {
> + txn->nentries = 0;
> + txn->nentries_mem = 0;
> + }
>
> I think we can write this comment as "Adjust nentries/nentries_mem
> based on the changes processed. See comments where nprocessed is
> declared."
>
> 5.
> + /*
> + * In streaming mode, sometime we can't stream all the changes due to the
> + * incomplete changes. So we can not directly reset the values of
> + * nentries/nentries_mem to 0 after one stream is sent like we do in
> + * non-streaming mode. So while sending one stream we keep count of the
> + * changes processed in thi stream and only those many changes we decrement
> + * from the nentries/nentries_mem.
> + */
> + uint64 nprocessed;
>
> How about something like: "Number of changes processed. This is used
> to keep track of changes that remained to be streamed. As of now,
> this can happen either due to toast tuples or speculative insertions
> where we need to wait for multiple changes before we can send them."

Done

> 6.
> + /* Size of the commplete changes. */
> + Size complete_size;
>
> Typo. /commplete/complete
>
> 7.
> + /*
> + * Increment the nprocessed count. See the detailed comment
> + * for usage of this in ReorderBufferTXN structure.
> + */
> + change->txn->nprocessed++;
>
> Ideally, this has to be incremented after processing the change. So,
> we can combine it with existing check in the patch as below:
>
> if (streaming)
> {
> change->txn->nprocessed++;
>
> if (rbtxn_has_incomplete_tuple(txn) &&
> prev_lsn == txn->last_complete_lsn)
> {
> /* Only in streaming mode we should get here. */
> Assert(streaming);
> partial_truncate = true;
> break;
> }
> }

Done

Apart from this, there was one more issue in this patch:
+ if (partial_truncate && rbtxn_has_incomplete_tuple(toptxn) &&
+ (change->lsn > toptxn->last_complete_lsn))
+ {
+ /*
+ * If this is a top transaction then we can reset the
+ * last_complete_lsn and complete_size, because by now we would
+ * have stream all the changes upto last_complete_lsn.
+ */
+ if (txn->toptxn == NULL)
+ {
+ toptxn->last_complete_lsn = InvalidXLogRecPtr;
+ toptxn->complete_size = 0;
+ }
+ break;

We should reset toptxn->last_complete_lsn and toptxn->complete_size
outside the (change->lsn > toptxn->last_complete_lsn) check, because we
might hit this condition while processing a subtransaction; in that case
the branch for the top transaction is never reached, so those fields
would never get reset.  I have fixed this.

Apart from this, there is one more fix in 0005: CheckLiveXid was never
reset, so I have fixed that as well.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v26.tar application/x-tar 282.5 KB
