Re: Fix a bug in DecodeAbort() and improve input data check on subscriber.

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: Fix a bug in DecodeAbort() and improve input data check on subscriber.
Date: 2021-12-07 09:05:05
Message-ID: CALDaNm1wX4JmbyonLkL0KEE=ixvA-zyB90f-KtY=Xm4PAXC10A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 7, 2021 at 6:07 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> Hi all,
>
> While updating the patch I recently posted[1] to make pg_waldump
> report replication origin ID, LSN, and timestamp, I found a bug that
> replication origin timestamp is not set in ROLLBACK PREPARED case.
> Commit 8bdb1332eb5 (CC'ed Amit) added an argument to
> ReorderBufferFinishPrepared() but in ROLLBACK PREPARED case, the
> caller specified it at the wrong position:
>
> @@ -730,6 +730,7 @@ DecodeCommit(LogicalDecodingContext *ctx,
> XLogRecordBuffer *buf,
> if (two_phase)
> {
> ReorderBufferFinishPrepared(ctx->reorder, xid, buf->origptr,
> buf->endptr,
> +
> SnapBuildInitialConsistentPoint(ctx->snapshot_builder),
> commit_time, origin_id, origin_lsn,
> parsed->twophase_gid, true);
> }
> @@ -868,6 +869,7 @@ DecodeAbort(LogicalDecodingContext *ctx,
> XLogRecordBuffer *buf,
> {
> ReorderBufferFinishPrepared(ctx->reorder, xid, buf->origptr,
> buf->endptr,
> abort_time, origin_id, origin_lsn,
> + InvalidXLogRecPtr,
> parsed->twophase_gid, false);
> }
>
> This affects the value of rollback_data.rollback_time on the
> subscriber, resulting in setting the wrong timestamp to both the
> replication origin timestamp and the error callback argument on the
> subscriber. I've attached the patch to fix it.
>
> Besides, I think we can improve checking input data on subscribers.
> This bug was not detected by compilers but it could have been detected
> if we checked the input data. Looking at logicalrep_read_xxx functions
> in proto.c, there are some inconsistencies; we check the value of
> prepare_data->xid in logicalrep_read_prepare_common() but we don't in
> both logicalrep_read_commit_prepared() and
> logicalrep_read_rollback_prepared(), and we don't check anything in
> stream_start/stream_stop. Also, IIUC that since timestamps are always
> set in prepare/commit prepared/rollback prepared cases we can check
> them too. I've attached a PoC patch that introduces macros for
> checking input data and adds some new checks. Since it could be
> frequently called, I used unlikely() but probably we can also consider
> replacing elog(ERROR) with assertions.

The first patch is required as suggested and fixes the problem.
Few comments on the second patch:
1) Should we validate prepare_time and xid in
logicalrep_read_begin_prepare. Is this not checked intentionally?
2) Similarly should we validate committime in logicalrep_read_stream_commit?

Regards,
Vignesh

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ronan Dunklau 2021-12-07 09:39:43 Re: Non-superuser subscription owners
Previous Message kuroda.hayato@fujitsu.com 2021-12-07 08:55:59 RE: Allow escape in application_name