Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2020-02-05 04:12:10
Message-ID: CAFiTN-skHvSWDHV66qpzMfnHH6AvsE2YAjvh4Kt613E8ZD8WoQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 30, 2020 at 4:11 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Jan 10, 2020 at 10:14 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Mon, Jan 6, 2020 at 2:11 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> >
> > >
> > > Few more comments:
> > > --------------------------------
> > > v4-0007-Implement-streaming-mode-in-ReorderBuffer
> > > 1.
> > > +ReorderBufferStreamTXN(ReorderBuffer *rb, ReorderBufferTXN *txn)
> > > {
> > > ..
> > > + /*
> > > + * TOCHECK: We have to rebuild historic snapshot to be sure it includes
> > > + * all information about subtransactions, which could arrive after
> > > + * streaming start.
> > > + */
> > > + if (!txn->is_schema_sent)
> > > + snapshot_now = ReorderBufferCopySnap(rb, txn->base_snapshot,
> > > + txn, command_id);
> > > ..
> > > }
> > >
> > > Why are we using base snapshot here instead of the snapshot we saved
> > > the first time streaming has happened? And as mentioned in comments,
> > > won't we need to consider the snapshots for subtransactions that
> > > arrived after the last time we have streamed the changes?
> > Fixed
> >
>
> +static void
> +ReorderBufferStreamTXN(ReorderBuffer *rb, ReorderBufferTXN *txn)
> {
> ..
> + /*
> + * We can not use txn->snapshot_now directly because after we there
> + * might be some new sub-transaction which after the last streaming run
> + * so we need to add those sub-xip in the snapshot.
> + */
> + snapshot_now = ReorderBufferCopySnap(rb, txn->snapshot_now,
> + txn, command_id);
>
> "because after we there", you seem to forget a word between 'we' and
> 'there'.
Fixed

> So as we are copying it now, does this mean it will consider
> the snapshots for subtransactions that arrived after the last time we
> have streamed the changes? If so, have you tested it and can we add
> the same in comments.
Yes, I have tested it. Comment added.
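
For reference, a sketch (not the exact patch hunk) of how the copy covers
later subtransactions, based on my reading that ReorderBufferCopySnap
rebuilds subxip from the transaction's current set of subxacts:

/*
 * Sketch only.  We can't use txn->snapshot_now directly because new
 * subtransactions may have started since the last streaming run; copying
 * the snapshot via ReorderBufferCopySnap rebuilds subxip from the
 * transaction's current subxacts, so those new subtransaction xids are
 * included as well.
 */
snapshot_now = ReorderBufferCopySnap(rb, txn->snapshot_now,
									 txn, command_id);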
>
> Also, if we need to copy the snapshot here, then do we need to again
> copy it in ReorderBufferProcessTXN(in below code and in catch block in
> the same function).
>
> {
> ..
> + /*
> + * Remember the command ID and snapshot if transaction is streaming
> + * otherwise free the snapshot if we have copied it.
> + */
> + if (streaming)
> + {
> + txn->command_id = command_id;
> + txn->snapshot_now = ReorderBufferCopySnap(rb, snapshot_now,
> + txn, command_id);
> + }
> + else if (snapshot_now->copied)
> + ReorderBufferFreeSnap(rb, snapshot_now);
> ..
> }
>
Fixed
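
To make that concrete, one possible shape of the streaming branch in
ReorderBufferProcessTXN that avoids a second copy, assuming the snapshot
was already copied by ReorderBufferStreamTXN (a sketch of the idea, not
the actual patch hunk):

/*
 * Sketch: if the snapshot was already copied (e.g. by
 * ReorderBufferStreamTXN), just remember it; otherwise copy it now.
 * The non-streaming case is unchanged.
 */
if (streaming)
{
	txn->command_id = command_id;
	txn->snapshot_now = snapshot_now->copied
		? snapshot_now
		: ReorderBufferCopySnap(rb, snapshot_now, txn, command_id);
}
else if (snapshot_now->copied)
	ReorderBufferFreeSnap(rb, snapshot_now);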
> > >
> > > 4. In ReorderBufferStreamTXN(), don't we need to set some of the txn
> > > fields like origin_id, origin_lsn as we do in ReorderBufferCommit()
> > > especially to cover the case when it gets called due to memory
> > > overflow (aka via ReorderBufferCheckMemoryLimit).
> > We get origin_lsn during commit time so I am not sure how can we do
> > that. I have also noticed that currently, we are not using origin_lsn
> > on the subscriber side. I think need more investigation that if we
> > want this then do we need to log it early.
> >
>
> Have you done any investigation of this point? You might want to look
> at pg_replication_origin* APIs. Today, again looking at this code, I
> think with current coding, it won't be used even when we encounter
> commit record. Because ReorderBufferCommit calls
> ReorderBufferStreamCommit which will make sure that origin_id and
> origin_lsn is never sent. I think at least that should be fixed, if
> not, probably, we need a comment with reasoning why we think it is
> okay not to do in this case.
Still, the problem is the same because currently we send origin_lsn as
part of the "pgoutput_begin" message, and for a streaming transaction we
have already sent the stream start by that point. We could instead send
it during the stream commit, but I am not completely sure it is worth
doing because the consumer of this message, "apply_handle_origin",
currently just ignores it. I have also looked into the
pg_replication_origin* APIs; they are used for setting the origin id and
tracking the progress, but they will not consume the origin_lsn we send
in pgoutput_begin, so they are not directly related.
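
For context, the subscriber side really does discard the origin payload
today; paraphrased from memory of apply_handle_origin in
src/backend/replication/logical/worker.c (exact wording may differ), it
only validates message ordering:

static void
apply_handle_origin(StringInfo s)
{
	/*
	 * An ORIGIN message may only arrive inside a remote transaction and
	 * before any actual writes; the origin data itself is not used.
	 */
	if (!in_remote_transaction ||
		(IsTransactionState() && !am_tablesync_worker()))
		ereport(ERROR,
				(errcode(ERRCODE_PROTOCOL_VIOLATION),
				 errmsg("ORIGIN message sent out of order")));
}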

>
> + /*
> + * If we are streaming the in-progress transaction then Discard the
>
> /Discard/discard
Done
>
> > >
> > > v4-0006-Gracefully-handle-concurrent-aborts-of-uncommitte
> > > 1.
> > > + /*
> > > + * If CheckXidAlive is valid, then we check if it aborted. If it did, we
> > > + * error out
> > > + */
> > > + if (TransactionIdIsValid(CheckXidAlive) &&
> > > + !TransactionIdIsInProgress(CheckXidAlive) &&
> > > + !TransactionIdDidCommit(CheckXidAlive))
> > > + ereport(ERROR,
> > > + (errcode(ERRCODE_TRANSACTION_ROLLBACK),
> > > + errmsg("transaction aborted during system catalog scan")));
> > >
> > > Why here we can't use TransactionIdDidAbort? If we can't use it, then
> > > can you add comments stating the reason of the same.
> > Done
>
> + * If CheckXidAlive is valid, then we check if it aborted. If it did, we
> + * error out. Instead of directly checking the abort status we do check
> + * if it is not in progress transaction and no committed. Because if there
> + * were a system crash then status of the the transaction which were running
> + * at that time might not have marked. So we need to consider them as
> + * aborted. Refer detailed comments at snapmgr.c where the variable is
> + * declared.
>
>
> How about replacing the above comment with below one:
>
> If CheckXidAlive is valid, then we check if it aborted. If it did, we
> error out. We can't directly use TransactionIdDidAbort as after crash
> such transaction might not have been marked as aborted. See detailed
> comments at snapmgr.c where the variable is declared.
Done
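
Putting the suggested wording together with the quoted check, the result
would read roughly as follows (assembled from the snippets above, not
copied from the patch):

/*
 * If CheckXidAlive is valid, then we check if it aborted.  If it did,
 * we error out.  We can't directly use TransactionIdDidAbort as after
 * crash such transaction might not have been marked as aborted.  See
 * detailed comments at snapmgr.c where the variable is declared.
 */
if (TransactionIdIsValid(CheckXidAlive) &&
	!TransactionIdIsInProgress(CheckXidAlive) &&
	!TransactionIdDidCommit(CheckXidAlive))
	ereport(ERROR,
			(errcode(ERRCODE_TRANSACTION_ROLLBACK),
			 errmsg("transaction aborted during system catalog scan")));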
>
> I am not able to understand the change in
> v8-0011-BUGFIX-set-final_lsn-for-subxacts-before-cleanup. Do you have
> any explanation for the same?

It appears that in ReorderBufferCommitChild we are always setting the
final_lsn of the subxacts so it should not be invalid. For testing, I
have changed this as an assert and checked but it never hit. So maybe
we can remove this change.
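
The testing-only change was along these lines (a sketch; the exact loop
in the patch may differ), asserting instead of back-filling final_lsn
before cleanup:

dlist_iter	iter;

/*
 * Sketch of the experiment: ReorderBufferCommitChild should already
 * have set final_lsn for every subtransaction, so assert that rather
 * than back-filling it before cleanup.
 */
dlist_foreach(iter, &txn->subtxns)
{
	ReorderBufferTXN *subtxn;

	subtxn = dlist_container(ReorderBufferTXN, node, iter.cur);
	Assert(!XLogRecPtrIsInvalid(subtxn->final_lsn));
}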

Apart from that, I have fixed the toast tuple streaming bug by setting a
flag bit in the WAL (attached as 0012). I have also extended this
solution to handle the speculative insert bug, so the old patch for the
speculative insert bug fix has been removed. I am also exploring how we
could do this without setting the flag in the WAL, as we discussed
upthread.
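
To give a feel for the 0012 approach without claiming its exact details,
a purely illustrative sketch follows; the flag name is hypothetical and
the real patch should be consulted for the actual WAL record changes:

/*
 * Illustrative sketch only -- XLH_INSERT_ON_TOAST_RELATION_SKETCH is a
 * made-up name.  The idea is to mark inserts into toast relations in
 * the WAL record, so logical decoding can tell that the main tuple is
 * still incomplete and must not stream the transaction mid-tuple.
 */
if (IsToastRelation(relation))
	xlrec.flags |= XLH_INSERT_ON_TOAST_RELATION_SKETCH;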

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v9-0001-Immediately-WAL-log-assignments.patch application/octet-stream 10.3 KB
v9-0005-Implement-streaming-mode-in-ReorderBuffer.patch application/octet-stream 37.8 KB
v9-0002-Issue-individual-invalidations-with-wal_level-log.patch application/octet-stream 16.4 KB
v9-0004-Gracefully-handle-concurrent-aborts-of-uncommitte.patch application/octet-stream 12.6 KB
v9-0003-Extend-the-output-plugin-API-with-stream-methods.patch application/octet-stream 34.8 KB
v9-0006-Support-logical_decoding_work_mem-set-from-create.patch application/octet-stream 13.1 KB
v9-0007-Add-support-for-streaming-to-built-in-replication.patch application/octet-stream 91.1 KB
v9-0008-Track-statistics-for-streaming.patch application/octet-stream 11.7 KB
v9-0010-BUGFIX-set-final_lsn-for-subxacts-before-cleanup.patch application/octet-stream 1013 bytes
v9-0009-Enable-streaming-for-all-subscription-TAP-tests.patch application/octet-stream 14.7 KB
v9-0011-Add-TAP-test-for-streaming-vs.-DDL.patch application/octet-stream 4.4 KB
v9-0012-Bugfix-handling-of-incomplete-toast-tuple.patch application/octet-stream 13.3 KB
