Re: [BUG] "FailedAssertion" reported when streaming in logical replication

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUG] "FailedAssertion" reported when streaming in logical replication
Date: 2021-04-27 06:20:01
Message-ID: CAFiTN-sN4UU6b4vVOgVsZk_L_2NxbutbU3vxR5E8EUOsJrkimA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Apr 27, 2021 at 11:43 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Apr 26, 2021 at 7:52 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Mon, Apr 26, 2021 at 6:59 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Mon, Apr 26, 2021 at 5:55 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > >
> > > > I am able to reproduce this and I think I have done the initial investigation.
> > > >
> > > > The cause of the issue is that this transaction has only one change,
> > > > and that change is XLOG_HEAP2_NEW_CID, which is added through
> > > > SnapBuildProcessNewCid. Basically, when we add any change through
> > > > SnapBuildProcessChange we set the base snapshot, but when we add one
> > > > through SnapBuildProcessNewCid we don't set the base snapshot, because
> > > > there is nothing to be done for this change. Now, this transaction is
> > > > identified as the biggest transaction with non-partial changes, and
> > > > then in ReorderBufferStreamTXN, it will return immediately because the
> > > > base_snapshot is NULL.
> > > >
> > >
> > > Your analysis sounds correct to me.
> > >
> >
> > Thanks, I have attached a patch to fix this.
> >
>
> Can't we use 'txns_by_base_snapshot_lsn' list for this purpose? It is
> ensured in ReorderBufferSetBaseSnapshot that we always assign
> base_snapshot to a top-level transaction if the current is a known
> subxact. I think that will be true because we always form xid-subxid
> relation before processing each record in
> LogicalDecodingProcessRecord.

Yeah, we can do that, but here we are only interested in top-level
transactions, and that list will also give us sub-transactions, so we
would have to skip them in the if condition below. So I think using
toplevel_by_lsn and skipping the transactions without a base_snapshot in
that if condition will be cheaper than processing all the transactions
with a base snapshot, i.e. txns_by_base_snapshot_lsn, and skipping the
sub-transactions there. What are your thoughts on this?
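
To make the comparison concrete, here is a rough, standalone sketch of
the selection loop I have in mind. It is not the patch itself: SketchTXN
and sketch_largest_top_txn are made-up names, the struct is a simplified
stand-in for ReorderBufferTXN, and a plain array stands in for the
toplevel_by_lsn list. The only point is that the base_snapshot check is
one extra condition in the existing loop over top-level transactions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, simplified stand-in for ReorderBufferTXN. The field names
 * are assumptions chosen for illustration, not the real struct layout.
 */
typedef struct SketchTXN
{
    size_t      total_size;         /* memory used by the txn and its subxacts */
    const void *base_snapshot;      /* NULL if no base snapshot was ever set */
    int         has_partial_change; /* nonzero if the changes are incomplete */
} SketchTXN;

/*
 * Walk an array standing in for the toplevel_by_lsn list and return the
 * largest transaction that could actually be streamed, or NULL if none
 * qualifies.
 */
static SketchTXN *
sketch_largest_top_txn(SketchTXN *txns, size_t ntxns)
{
    SketchTXN  *largest = NULL;

    assert(txns != NULL || ntxns == 0);

    for (size_t i = 0; i < ntxns; i++)
    {
        SketchTXN  *txn = &txns[i];

        /* Skip a txn with no base snapshot, e.g. one holding only a new-CID change. */
        if (txn->base_snapshot == NULL)
            continue;

        /* Skip a txn whose changes are still partial. */
        if (txn->has_partial_change)
            continue;

        /* Skip an empty txn; there is nothing to stream. */
        if (txn->total_size == 0)
            continue;

        if (largest == NULL || txn->total_size > largest->total_size)
            largest = txn;
    }

    return largest;
}
```

With this shape, a transaction that is the biggest by size but has a NULL
base_snapshot is simply never picked, so the caller never tries to stream
something that would return immediately.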

> Few other minor comments:
> 1. I think we can update the comments atop function ReorderBufferLargestTopTXN.
> 2. minor typo in comments atop ReorderBufferLargestTopTXN "...There is
> a scope of optimization here such that we can select the largest
> transaction which has complete changes...". In this 'complete' should
> be incomplete. This is not related to this patch but I think we can
> fix it along with this because anyway we are going to change
> surrounding comments.

I will work on these in the next version.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
