| From: | "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com> | 
|---|---|
| To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> | 
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Oh, Mike" <minsoo(at)amazon(dot)com> | 
| Subject: | RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns | 
| Date: | 2022-07-14 02:16:00 | 
| Message-ID: | OSZPR01MB63102143FE2431519DEF7AFDFD889@OSZPR01MB6310.jpnprd01.prod.outlook.com | 
| Lists: | pgsql-hackers | 
On Tue, Jul 12, 2022 5:23 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> 
> On Tue, Jul 12, 2022 at 5:58 PM shiy(dot)fnst(at)fujitsu(dot)com
> <shiy(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > It happened when executing the following code because it tried to free a NULL
> > pointer (catchange_xip).
> >
> >         /* be tidy */
> >         if (ondisk)
> >                 pfree(ondisk);
> > +       if (catchange_xip)
> > +               pfree(catchange_xip);
> >  }
> >
> > It seems to be related to configure option. I could reproduce it when using
> > `./configure --enable-debug`.
> > But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og -ggdb"`.
> 
> Hmm, I could not reproduce this problem even if I use ./configure
> --enable-debug. And it's weird that we checked if catchange_xip is not
> null but we did pfree for it:
> 
> #1  pfree (pointer=0x0) at mcxt.c:1177
> #2  0x000000000078186b in SnapBuildSerialize (builder=0x1fd5e78,
> lsn=25719712) at snapbuild.c:1792
> 
> Is it reproducible in your environment?
Thanks for your reply! Yes, it is reproducible. I also reproduced it with the
v4 patch you posted [1].
> If so, could you test it again
> with the following changes?
> 
> diff --git a/src/backend/replication/logical/snapbuild.c
> b/src/backend/replication/logical/snapbuild.c
> index d015c06ced..a6e76e3781 100644
> --- a/src/backend/replication/logical/snapbuild.c
> +++ b/src/backend/replication/logical/snapbuild.c
> @@ -1788,7 +1788,7 @@ out:
>     /* be tidy */
>     if (ondisk)
>         pfree(ondisk);
> -   if (catchange_xip)
> +   if (catchange_xip != NULL)
>         pfree(catchange_xip);
>  }
> 
I tried this and could still reproduce the problem.
I also tried Amit's suggestion [2]: the problem is fixed by checking the value
of catchange_xcnt instead of catchange_xip before calling pfree.
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index c482e906b0..68b9c4ef7d 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1573,7 +1573,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
        Size            needed_length;
        SnapBuildOnDisk *ondisk = NULL;
        TransactionId   *catchange_xip = NULL;
-       size_t          catchange_xcnt;
+       size_t          catchange_xcnt = 0;
        char       *ondisk_c;
        int                     fd;
        char            tmppath[MAXPGPATH];
@@ -1788,7 +1788,7 @@ out:
        /* be tidy */
        if (ondisk)
                pfree(ondisk);
-       if (catchange_xip)
+       if (catchange_xcnt != 0)
                pfree(catchange_xip);
 }
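For illustration only, here is a minimal standalone sketch of the count-based
guard. It is not the patch itself: malloc/free stand in for palloc/pfree, and
copy_xids and serialize_sketch are hypothetical names made up for this example.
It only shows the shape of the cleanup, where the array is released when the
count is nonzero and the count is initialized at declaration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;	/* stand-in for the PostgreSQL typedef */

/*
 * Hypothetical helper: returns a malloc'd copy of the first n xids and
 * stores n in *xcnt_out.  When there is nothing to copy (or allocation
 * fails) it returns NULL and stores 0.
 */
static TransactionId *
copy_xids(const TransactionId *src, size_t n, size_t *xcnt_out)
{
	TransactionId *dst;
	size_t		i;

	*xcnt_out = 0;
	if (n == 0)
		return NULL;

	dst = malloc(n * sizeof(TransactionId));
	if (dst == NULL)
		return NULL;

	for (i = 0; i < n; i++)
		dst[i] = src[i];

	*xcnt_out = n;
	return dst;
}

/*
 * Hypothetical function with the same cleanup shape as the diff above.
 * Returns 1 if the array was released, 0 if there was nothing to release.
 */
static int
serialize_sketch(const TransactionId *src, size_t n)
{
	TransactionId *catchange_xip = NULL;
	size_t		catchange_xcnt = 0;	/* initialized, so the test below is always defined */
	int			freed = 0;

	catchange_xip = copy_xids(src, n, &catchange_xcnt);

	/* ... the array would be written out here ... */

	/* be tidy: release the array only when it actually holds entries */
	if (catchange_xcnt != 0)
	{
		assert(catchange_xip != NULL);	/* nonzero count implies an allocation */
		free(catchange_xip);
		freed = 1;
	}

	return freed;
}
```

In this sketch, serialize_sketch returns 1 for a non-empty input and 0 for an
empty one, so the release call is never reached with a NULL pointer.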
Regards,
Shi yu