Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2020-05-22 12:51:50
Message-ID: CAFiTN-sC=igkzewbcPUROXd4xP_bsFRe1=VVeRGOHB-jB2sssw@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 18, 2020 at 4:10 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Sun, May 17, 2020 at 12:41 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Fri, May 15, 2020 at 4:04 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > >
> > > Review comments:
> > > ------------------------------
> > > 1.
> > > @@ -1762,10 +1952,16 @@ ReorderBufferCommit(ReorderBuffer *rb,
> > > TransactionId xid,
> > > }
> > >
> > > case REORDER_BUFFER_CHANGE_MESSAGE:
> > > - rb->message(rb, txn, change->lsn, true,
> > > - change->data.msg.prefix,
> > > - change->data.msg.message_size,
> > > - change->data.msg.message);
> > > + if (streaming)
> > > + rb->stream_message(rb, txn, change->lsn, true,
> > > + change->data.msg.prefix,
> > > + change->data.msg.message_size,
> > > + change->data.msg.message);
> > > + else
> > > + rb->message(rb, txn, change->lsn, true,
> > > + change->data.msg.prefix,
> > > + change->data.msg.message_size,
> > > + change->data.msg.message);
> > >
> > > Don't we need to set any_data_sent flag while streaming messages as we
> > > do for other types of changes?
> >
> > I think any_data_sent was added to avoid sending an abort to the
> > subscriber if we haven't sent any data, but this is not complete,
> > as the output plugin can also decide not to send. So I think this
> > should not be done as part of this patch and can be done
> > separately. I think there is already a thread for handling the
> > same[1]
> >
>
> Hmm, but prior to this patch, we never used to send (empty) aborts,
> but now that will be possible. It is probably okay to deal with that
> in the other patch you mentioned, but I felt at least any_data_sent
> would work for some cases. OTOH, it appears to be a half-baked
> solution, so we should probably refrain from adding it. BTW, how does
> the pgoutput plugin deal with it? I see that apply_handle_stream_abort
> will unconditionally try to unlink the file and it will probably fail.
> Have you tested this scenario after your latest changes?

Yeah, I see. I think this is a problem, but it exists even without my
latest change: if pgoutput ignores some changes because they are not
published, then we will see a similar error. Shall we handle the
ENOENT error case from unlink? I think the best solution is to track
the empty transaction.
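
If we go with tolerating ENOENT, something along these lines is what I
have in mind (just a sketch; the actual path construction and the rest
of the cleanup in apply_handle_stream_abort are omitted here):

    /*
     * Sketch only: the changes file may never have been created if the
     * output plugin filtered out every change of the streamed subxact,
     * so treat a missing file as already cleaned up.
     */
    if (unlink(path) < 0 && errno != ENOENT)
        ereport(ERROR,
                (errcode_for_file_access(),
                 errmsg("could not remove file \"%s\": %m", path)));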

> > > 4.
> > > In ReorderBufferProcessTXN(), the patch is calling stream_stop in both
> > > the try and catch block. If there is an error after calling it in a
> > > try block, we might call it again via catch. I think that will lead
> > > to sending a stop message twice. Won't that be a problem? See the
> > > usage of iterstate in the catch block, we have made it safe from a
> > > similar problem.
> >
> > IMHO, we don't need that, because we only call stream_stop in the
> > catch block if the error type is ERRCODE_TRANSACTION_ROLLBACK. So if
> > we have already stopped the stream in the TRY block, then we should
> > not get that error. I have added comments explaining the same.
> >
>
> I am still slightly nervous about it, as I don't see any solid
> guarantee for the same. You are right as the code stands today, but
> code added in the future might not keep it true. I feel it is better
> to have an Assert here to ensure that stream_stop won't be called a
> second time. I don't see any good way of doing it other than by
> maintaining a flag or some state, but I think it will be good to
> ensure this.

Done
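
To make the intent clear, the guard is essentially of this shape (the
flag name is only illustrative, not necessarily what the patch uses):

    /*
     * stream_stopped would be set to true once the TRY block has sent
     * stream_stop, so the concurrent-abort path in the catch block can
     * never send it a second time.
     */
    if (errdata->sqlerrcode == ERRCODE_TRANSACTION_ROLLBACK)
    {
        Assert(!stream_stopped);
        rb->stream_stop(rb, txn);
    }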

> > > 6.
> > > PG_CATCH();
> > > {
> > > + MemoryContext ecxt = MemoryContextSwitchTo(ccxt);
> > > + ErrorData *errdata = CopyErrorData();
> > >
> > > I don't understand the usage of memory context in this part of the
> > > code. Basically, you are switching to CurrentMemoryContext here, do
> > > some error handling and then again reset back to some random context
> > > before rethrowing the error. If there is some purpose for it, then it
> > > might be better if you can write a few comments to explain the same.
> >
> > Basically, ccxt is the CurrentMemoryContext when we started the
> > streaming and ecxt is the context when we catch the error. So
> > ideally, before this change, it would rethrow in the context in
> > which we caught the error, i.e. ecxt. So what we are trying to do is
> > switch back to the normal context (ccxt) and copy the error data in
> > the normal context. And, if we are not handling it gracefully, then
> > we switch back to the context it was in, and rethrow.
> >
>
> Okay, but when the error code is *not* ERRCODE_TRANSACTION_ROLLBACK,
> don't we need to clean up the reorder buffer by calling
> ReorderBufferCleanupTXN? If so, then you can try to combine it with
> the non-streaming else branch.

Done
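
For reference, the catch block now roughly has this shape (a simplified
sketch, not the literal patch code):

    PG_CATCH();
    {
        MemoryContext ecxt = MemoryContextSwitchTo(ccxt);
        ErrorData  *errdata = CopyErrorData();

        if (errdata->sqlerrcode == ERRCODE_TRANSACTION_ROLLBACK)
        {
            /*
             * Concurrent abort of the streamed transaction: absorb the
             * error, stop the stream (guarded by the Assert discussed
             * above) and remember the abort for a later stream_abort.
             */
            FlushErrorState();
            FreeErrorData(errdata);
        }
        else
        {
            /*
             * Any other error: discard this transaction's reorder
             * buffer state and rethrow in the context we caught the
             * error in.
             */
            ReorderBufferCleanupTXN(rb, txn);
            MemoryContextSwitchTo(ecxt);
            PG_RE_THROW();
        }
    }
    PG_END_TRY();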

> > > 8.
> > > @@ -2295,6 +2677,13 @@ ReorderBufferXidSetCatalogChanges(ReorderBuffer
> > > *rb, TransactionId xid,
> > > txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
> > >
> > > txn->txn_flags |= RBTXN_HAS_CATALOG_CHANGES;
> > > +
> > > + /*
> > > + * TOCHECK: Mark toplevel transaction as having catalog changes too
> > > + * if one of its children has.
> > > + */
> > > + if (txn->toptxn != NULL)
> > > + txn->toptxn->txn_flags |= RBTXN_HAS_CATALOG_CHANGES;
> > > }
> > >
> > > Why are we marking top transaction here?
> >
> > We need to mark the top transaction to decide whether to build the
> > tuplecid hash or not. In non-streaming mode, we only send at commit
> > time, and at commit time we know whether the top transaction has any
> > catalog changes based on the invalidation messages, so we mark the
> > top transaction there in DecodeCommit. Since here we are not waiting
> > until commit, we need to mark the top transaction as soon as we mark
> > any of its child transactions.
> >
>
> But how does it help? We use this flag (via
> ReorderBufferXidHasCatalogChanges) in SnapBuildCommitTxn, which is
> anyway done in DecodeCommit, and that too after setting this flag for
> the top transaction if required. So, how will it help to set it while
> processing the subxid? Also, even if we have to do it, won't it add
> the xid needlessly to the builder->committed.xip array?

In ReorderBufferBuildTupleCidHash, we use this flag to decide whether
to build the tuplecid hash, based on whether the transaction has
catalog changes.
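
Concretely, the relevant check is the early-exit at the top of that
function, roughly the following (quoting from memory, so take it as an
approximation):

    /*
     * If the (top-level) transaction has no catalog changes, or no
     * tuplecid entries, there is nothing to map.  Without propagating
     * RBTXN_HAS_CATALOG_CHANGES to the top transaction, a streamed
     * transaction whose catalog changes happened only in a subxact
     * would skip building the hash here.
     */
    if (!(txn->txn_flags & RBTXN_HAS_CATALOG_CHANGES) ||
        dlist_is_empty(&txn->tuplecids))
        return;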

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
