Re: BUG in 10.1 - dsa_area could not attach to a segment that has been freed

From: Alexander Voytsekhovskyy <av(at)mobile-ua(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG in 10.1 - dsa_area could not attach to a segment that has been freed
Date: 2017-12-07 20:55:56
Message-ID: 21FE7F08-BFD3-47DB-8F68-184F02778273@gmail.com
Lists: pgsql-bugs

Hi,

I can confirm that this patch fixes the bug for me.

I can't provide a minimal reproducer, but I can share the whole data set with you.

> On Thu, Nov 30, 2017 at 10:18 AM, Thomas Munro
> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>> On Thu, Nov 30, 2017 at 9:34 AM, Alexander Voytsekhovskyy
>> <young(dot)inbox(at)gmail(dot)com> wrote:
>>> Thanks for helping, here is one more try
>>>
>>> #0 get_segment_by_index (area=area@entry=0x556026700be8, index=1) at
>>> /build/postgresql-10-qAeTPy/postgresql-10-10.1/build/../src/backend/utils/mmgr/dsa.c:1736
>>> #1 0x00005560252c2b90 in dsa_get_address (area=area@entry=0x556026700be8,
>>> dp=dp@entry=1099511685280) at
>>> /build/postgresql-10-qAeTPy/postgresql-10-10.1/build/../src/backend/utils/mmgr/dsa.c:945
>>> #2 0x00005560250a2c2b in tbm_attach_shared_iterate
>>> (dsa=dsa@entry=0x556026700be8, dp=1099511685280) at
>>> /build/postgresql-10-qAeTPy/postgresql-10-10.1/build/../src/backend/nodes/tidbitmap.c:1503
>>> #3 0x0000556025066c7b in BitmapHeapNext (node=node@entry=0x556026460710) at
>>> /build/postgresql-10-qAeTPy/postgresql-10-10.1/build/../src/backend/executor/nodeBitmapHeapscan.c:176
>>
>> Thank you for the report and the back trace. I think this might be a
>> manifestation of the problem I just described[1] on -hackers.
>> Depending on the shape of a multi-Gather query plan and therefore the
>> order of control flow, you might finish up using the DSA area that
>> belongs to a different Gather node and then find that it goes away too
>> soon. Investigating.
>
> I haven't managed to reproduce this, but I was coincidentally
> investigating a bug that appears to explain it. I think what happened
> is that a background worker was first to execute BitmapHeapNext and
> allocated a dsa_pointer, and then the leader process reached
> BitmapHeapNext and called tbm_attach_shared_iterate which tried to
> dereference it, but it had es_query_dsa set to another gather node's DSA
> area (whichever Gather most recently ran ExecInitParallelPlan). That
> requires a certain order of execution and timing that I'm not sure how
> to reach. I have posted a patch that should fix it over here:
>
> https://www.postgresql.org/message-id/CAEepm%3D0Mv9BigJPpribGQhnHqVGYo2%2BkmzekGUVJJc9Y_ZVaYA%40mail.gmail.com
>
> Are you able to provide a minimal reproducer, an anonymised partial
> dump, or perhaps try out the patch on a copy of your database?
>
> --
> Thomas Munro
> http://www.enterprisedb.com
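
For anyone reading the backtrace above: the dp value passed to
dsa_get_address can be split into a segment index and an offset, which
shows why frame #0 is looking up index=1. The sketch below assumes the
64-bit dsa_pointer layout from dsa.c, with the low 40 bits holding the
offset; the constant and the helper name decode_dsa_pointer are
illustrative, not taken from this thread.

```python
# Sketch only: assumes the 64-bit dsa_pointer layout in
# src/backend/utils/mmgr/dsa.c, where the low 40 bits are the offset
# within a segment and the remaining high bits are the segment index.
DSA_OFFSET_WIDTH = 40
DSA_OFFSET_BITMASK = (1 << DSA_OFFSET_WIDTH) - 1


def decode_dsa_pointer(dp):
    """Return (segment_index, offset) for a 64-bit dsa_pointer value."""
    return dp >> DSA_OFFSET_WIDTH, dp & DSA_OFFSET_BITMASK


# dp value taken from frames #1 and #2 of the backtrace.
segment_index, offset = decode_dsa_pointer(1099511685280)
print(segment_index, offset)  # prints "1 57504"
```

Under that assumption the pointer refers to segment 1, which matches the
index=1 argument of get_segment_by_index in frame #0.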
