Re: Forget close an open relation in ReorderBufferProcessTXN()

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Japin Li <japinli(at)hotmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Forget close an open relation in ReorderBufferProcessTXN()
Date: 2021-05-21 07:42:42
Message-ID: CA+HiwqHthnh2G8XFBV1++JyP34ir7OGSznYPV2uMWsQK40-JoA@mail.gmail.com
Lists: pgsql-hackers

On Fri, May 21, 2021 at 3:55 PM osumi(dot)takamichi(at)fujitsu(dot)com
<osumi(dot)takamichi(at)fujitsu(dot)com> wrote:
> On Thursday, May 20, 2021 9:59 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > Here are updated/divided patches.
> Thanks for your updates.
>
> But, I've detected segmentation faults caused by the patch,
> which can happen during 100_bugs.pl in src/test/subscription.
> This happens more than one in ten times.
>
> This problem looks like a timing issue and was already introduced by v3.
> I also applied v5 to HEAD and reproduced the failure, while
> OSS HEAD doesn't reproduce it, even when I executed 100_bugs.pl 200 times in a tight loop.
> I used commit id 4f586fe2 as the base for all checks. The logs below are the ones I got from v3.
>
> My first guess at the cause is that, between getting an entry from hash_search() in get_rel_sync_entry()
> and setting the map with convert_tuples_by_name() in maybe_send_schema(), an invalidation message arrived
> and tried to free the not-yet-set descs in the entry?

Hmm, maybe get_rel_sync_entry() should explicitly set map to NULL when
first initializing an entry. It's possible that without doing so, the
map is left holding a garbage value, so that an invalidation callback
that runs into such a partially initialized entry segfaults upon
trying to dereference that garbage pointer.
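
To illustrate the idea outside of PostgreSQL, here is a minimal standalone
sketch. It is not the code from the attached patches: DemoMap, DemoEntry and
the demo_* functions are hypothetical stand-ins, and malloc() plus memset()
only simulate a cache that hands back a new entry without zeroing its payload.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a tuple conversion map (not the real type). */
typedef struct DemoMap
{
    int         natts;
} DemoMap;

/* Hypothetical, trimmed-down cache entry (not the real RelationSyncEntry). */
typedef struct DemoEntry
{
    bool        schema_sent;
    DemoMap    *map;
} DemoEntry;

/*
 * Simulate a cache lookup that creates a new entry and returns it with an
 * uninitialized payload; 0x7f bytes play the role of leftover garbage.
 */
static DemoEntry *
demo_alloc_entry(void)
{
    DemoEntry  *entry = malloc(sizeof(DemoEntry));

    assert(entry != NULL);
    memset(entry, 0x7f, sizeof(DemoEntry));
    return entry;
}

/*
 * First-time initialization: every field the invalidation callback looks
 * at must be given a defined value here, including the map pointer.
 */
static void
demo_init_entry(DemoEntry *entry)
{
    entry->schema_sent = false;
    entry->map = NULL;          /* the fix: never leave this as garbage */
}

/*
 * Simulated invalidation callback. It frees the map only if one was set,
 * which is safe only when map is either NULL or a valid pointer.
 * Returns true if a map was actually freed.
 */
static bool
demo_invalidate(DemoEntry *entry)
{
    bool        freed = false;

    entry->schema_sent = false;
    if (entry->map != NULL)
    {
        free(entry->map);
        entry->map = NULL;
        freed = true;
    }
    return freed;
}
```

Calling demo_invalidate() right after demo_alloc_entry(), without
demo_init_entry() in between, would pass the garbage pointer to free(),
which is the kind of crash described above. With the explicit NULL
assignment, an invalidation that arrives before the map is built is a no-op.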

I've tried that in the attached v6 patches. Please check.

--
Amit Langote
EDB: http://www.enterprisedb.com

Attachments:
  HEAD-v6-0001-pgoutput-fix-memory-management-of-RelationSyncEnt.patch (application/octet-stream, 5.4 KB)
  HEAD-v6-0002-pgoutput-don-t-send-leaf-partition-schema-when-pu.patch (application/octet-stream, 2.1 KB)
  PG13-v6-0002-pgoutput-don-t-send-leaf-partition-schema-when-pu.patch (application/octet-stream, 2.0 KB)
  PG13-v6-0001-pgoutput-fix-memory-management-for-RelationSyncEn.patch (application/octet-stream, 5.3 KB)
