Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2020-05-28 07:16:08
Message-ID: CAFiTN-snK3V3mRLL_12n8Y5Nta70dn_xxd1u5S85Bdc8K28gMw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, May 26, 2020 at 12:00 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, May 22, 2020 at 6:22 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Mon, May 18, 2020 at 4:10 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Sun, May 17, 2020 at 12:41 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > >
> > > > On Fri, May 15, 2020 at 4:04 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > >
> > > > >
> > > > > Review comments:
> > > > > ------------------------------
> > > > > 1.
> > > > > @@ -1762,10 +1952,16 @@ ReorderBufferCommit(ReorderBuffer *rb,
> > > > > TransactionId xid,
> > > > > }
> > > > >
> > > > > case REORDER_BUFFER_CHANGE_MESSAGE:
> > > > > - rb->message(rb, txn, change->lsn, true,
> > > > > - change->data.msg.prefix,
> > > > > - change->data.msg.message_size,
> > > > > - change->data.msg.message);
> > > > > + if (streaming)
> > > > > + rb->stream_message(rb, txn, change->lsn, true,
> > > > > + change->data.msg.prefix,
> > > > > + change->data.msg.message_size,
> > > > > + change->data.msg.message);
> > > > > + else
> > > > > + rb->message(rb, txn, change->lsn, true,
> > > > > + change->data.msg.prefix,
> > > > > + change->data.msg.message_size,
> > > > > + change->data.msg.message);
> > > > >
> > > > > Don't we need to set any_data_sent flag while streaming messages as we
> > > > > do for other types of changes?
> > > >
> > > > I think any_data_sent, was added to avoid sending abort to the
> > > > subscriber if we haven't sent any data, but this is not complete as
> > > > the output plugin can also take the decision not to send. So I think
> > > > this should not be done as part of this patch and can be done
> > > > separately. I think there is already a thread for handling the
> > > > same[1]
> > > >
> > >
> > > Hmm, but prior to this patch, we never use to send (empty) aborts but
> > > now that will be possible. It is probably okay to deal that with
> > > another patch mentioned by you but I felt at least any_data_sent will
> > > work for some cases. OTOH, it appears to be half-baked solution, so
> > > we should probably refrain from adding it. BTW, how do the pgoutput
> > > plugin deal with it? I see that apply_handle_stream_abort will
> > > unconditionally try to unlink the file and it will probably fail.
> > > Have you tested this scenario after your latest changes?
> >
> > Yeah, I see, I think this is a problem, but this exists without my
> > latest change as well, if pgoutput ignore some changes because it is
> > not published then we will see a similar error. Shall we handle the
> > ENOENT error case from unlink?
> Isn't this problem only for subxact file as we anyway create changes
> file as part of start stream message which should have come after
> abort? If so, can't we detect whether subxact file exists probably by
> using nsubxacts or something like that? Can you please once try to
> reproduce this scenario to ensure that we are not missing anything?

I have tested this, as of now, by default we create both changes and
subxact files irrespective of whether we get any subtransactions or
not. Maybe this could be optimized that only if we have any subxact
then only create that file otherwise not? What's your opinion on the
same.

> > > > > 8.
> > > > > @@ -2295,6 +2677,13 @@ ReorderBufferXidSetCatalogChanges(ReorderBuffer
> > > > > *rb, TransactionId xid,
> > > > > txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
> > > > >
> > > > > txn->txn_flags |= RBTXN_HAS_CATALOG_CHANGES;
> > > > > +
> > > > > + /*
> > > > > + * TOCHECK: Mark toplevel transaction as having catalog changes too
> > > > > + * if one of its children has.
> > > > > + */
> > > > > + if (txn->toptxn != NULL)
> > > > > + txn->toptxn->txn_flags |= RBTXN_HAS_CATALOG_CHANGES;
> > > > > }
> > > > >
> > > > > Why are we marking top transaction here?
> > > >
> > > > We need to mark top transaction to decide whether to build tuplecid
> > > > hash or not. In non-streaming mode, we are only sending during the
> > > > commit time, and during commit time we know whether the top
> > > > transaction has any catalog changes or not based on the invalidation
> > > > message so we are marking the top transaction there in DecodeCommit.
> > > > Since here we are not waiting till commit so we need to mark the top
> > > > transaction as soon as we mark any of its child transactions.
> > > >
> > >
> > > But how does it help? We use this flag (via
> > > ReorderBufferXidHasCatalogChanges) in SnapBuildCommitTxn which is
> > > anyway done in DecodeCommit and that too after setting this flag for
> > > the top transaction if required. So, how will it help in setting it
> > > while processing for subxid. Also, even if we have to do it won't it
> > > add the xid needlessly in builder->committed.xip array?
> >
> > In ReorderBufferBuildTupleCidHash, we use this flag to decide whether
> > to build the tuplecid hash or not based on whether it has catalog
> > changes or not.
> >
>
> Okay, but you haven't answered the second part of the question: "won't
> it add the xid of top transaction needlessly in builder->committed.xip
> array, see function SnapBuildCommitTxn?" IIUC, this can happen
> without patch as well because DecodeCommit also sets the flags just
> based on invalidation messages irrespective of whether the messages
> are generated by top transaction or not, is that right?

Yes, with or without the patch it always adds the topxid. I think
purpose for doing this with/without patch is not for the snapshot
instead we are marking the top itself that some of its subtxn has the
catalog changes so that while building the tuplecid has we can know
whether to build the hash or not. But, having said that I feel in
ReorderBufferBuildTupleCidHash why do we need these two checks
if (!rbtxn_has_catalog_changes(txn) || dlist_is_empty(&txn->tuplecids))
return;

I mean it should be enough to just have the check, because if we have
added something to the tuplecids then catalog changes must be there
because that time we are setting the catalog changes to true.

if (dlist_is_empty(&txn->tuplecids))
return;

I think in the base code there are multiple things going on
1. If we get new CID we always set the catalog change in that
transaction but add the tuplecids in the top transaction. So
basically, top transaction is so far not marked with catalog changes
but it has tuplecids.
2. Now, in DecodeCommit the top xid will be marked that it has catalog
changes based on the invalidation messages.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-05-28 07:22:33 Re: SIGSEGV from START_REPLICATION 0/XXXXXXX in XLogSendPhysical () at walsender.c:2762
Previous Message Michael Paquier 2020-05-28 07:10:12 Re: BufFileRead() error signalling