Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Oh, Mike" <minsoo(at)amazon(dot)com>
Subject: Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date: 2022-07-07 02:50:36
Message-ID: CAD21AoBHo3WQ+vaeiNzXuNw__08veBT9qYgCUD3jNLWG=4GzFw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 6, 2022 at 5:55 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Jul 6, 2022 at 12:19 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Tue, Jul 5, 2022 at 8:00 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > 2. Are we anytime removing transaction ids from catchanges->xip array?
> >
> > No.
> >
> > > If not, is there a reason for the same? I think we can remove it
> > > either at commit/abort or even immediately after adding the xid/subxid
> > > to committed->xip array.
> >
> > It might be a good idea but I'm concerned that removing XID from the
> > array at every commit/abort or after adding it to committed->xip array
> > might be costly as it requires adjustment of the array to keep its
> > order. Removing XIDs from the array would make bsearch faster but the
> > array is updated reasonably often (every 15 sec).
> >
>
> Fair point. However, I am slightly worried that we are unnecessarily
> searching in this new array even when ReorderBufferTxn has the
> required information. To avoid that, in function
> SnapBuildXidHasCatalogChange(), we can first check
> ReorderBufferXidHasCatalogChanges() and then check the array if the
> first check doesn't return true. Also, by the way, do we need to
> always keep builder->catchanges.xip updated via SnapBuildRestore()?
> Isn't it sufficient that we just read and throw away contents from a
> snapshot if builder->catchanges.xip is non-NULL?

IIUC catchanges.xip is restored only once when restoring a consistent
snapshot via SnapBuildRestore(). I think it's necessary to set
catchanges.xip for later use in SnapBuildXidHasCatalogChange(). Or did
you mean via SnapBuildSerialize()?∫

>
> I had additionally thought if can further optimize this solution to
> just store this additional information when we need to serialize for
> checkpoint record but I think that won't work because walsender can
> restart even without resatart of server in which case the same problem
> can occur.

Yes, probably we need to write catalog modifying transactions for
every serialized snapshot.

> I am not if sure there is a way to further optimize this
> solution, let me know if you have any ideas?

I suppose that writing additional information to serialized snapshots
would not be a noticeable overhead since we need 4 bytes per
transaction and we would not expect there is a huge number of
concurrent catalog modifying transactions. But both collecting catalog
modifying transactions (especially when there are many ongoing
transactions) and bsearch'ing on the XID list every time decoding the
COMMIT record could bring overhead.

A solution for the first point would be to keep track of catalog
modifying transactions by using a linked list so that we can avoid
checking all ongoing transactions.

Regarding the second point, on reflection, I think we need to look up
the XID list until all XID in the list is committed/aborted. We can
remove XIDs from the list after adding it to committed.xip as you
suggested. Or when decoding a RUNNING_XACTS record, we can remove XIDs
older than builder->xmin from the list like we do for committed.xip in
SnapBuildPurgeCommittedTxn().

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-07-07 03:06:59 Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower
Previous Message David Rowley 2022-07-07 02:13:54 Re: tuplesort Generation memory contexts don't play nicely with index builds