Re: [HACKERS] SERIALIZABLE with parallel query

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Kevin Grittner <kgrittn(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] SERIALIZABLE with parallel query
Date: 2018-02-23 06:56:37
Message-ID: CAA4eK1+smn3Z7xmwr+6d=LW3vmDvk54rZUGy7b8H6Vc3RhwspQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 23, 2018 at 8:48 AM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> On Fri, Feb 23, 2018 at 3:29 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> On Thu, Feb 22, 2018 at 10:35 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> On Thu, Feb 22, 2018 at 7:54 AM, Thomas Munro
>>>> PS I noticed that for BecomeLockGroupMember() we say "If we can't
>>>> join the lock group, the leader has gone away, so just exit quietly"
>>>> but for various other similar things we spew errors (most commonly
>>>> seen one being "ERROR: could not map dynamic shared memory segment").
>>>> Intentional?
>>>
>>> I suppose I thought that if we failed to map the dynamic shared memory
>>> segment, it might be down to any one of several causes; whereas if we
>>> fail to join the lock group, it must be because the leader has already
>>> exited. There might be a flaw in that thinking, though.
>>>
>>
>> By the way, in which case can the leader exit early? As of now, we
>> wait for workers to end both when the query finishes normally and in
>> error cases.
>
> create table foo as select generate_series(1, 10)::int a;
> alter table foo set (parallel_workers = 2);
> set parallel_setup_cost = 0;
> set parallel_tuple_cost = 0;
> select count(a / 0) from foo;
>
> That reliably gives me:
> ERROR: division by zero [from leader]
> ERROR: could not map dynamic shared memory segment [from workers]
>
> I thought this was coming from resource manager cleanup, but you're
> right: that happens after we wait for all workers to finish. Perhaps
> this is a race within DestroyParallelContext() itself: when it is
> called by AtEOXact_Parallel() during an abort, it asks the postmaster
> to SIGTERM the workers, then it immediately detaches from the DSM
> segment, and then it waits for the worker to start up.
>

I guess you mean that it waits for the worker to shut down/exit. Why
would it wait for startup at that stage?
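
To make the ordering in the hypothesis quoted above concrete, here is a
toy model in Python. It is not PostgreSQL code: Segment, run() and the
reference counting are hypothetical stand-ins, and it assumes a worker
that has been launched but has not yet attached when the leader
detaches, and that the termination request has not reached it by then.

```python
class Segment:
    """Toy stand-in for a DSM segment: it exists while refcount > 0."""

    def __init__(self):
        self.refcount = 1  # the leader's own mapping

    def attach(self):
        # Attaching to a segment nobody maps any more fails.
        if self.refcount == 0:
            raise RuntimeError("could not map dynamic shared memory segment")
        self.refcount += 1

    def detach(self):
        self.refcount -= 1


def run(worker_attaches_before_abort):
    """Return the error lines produced under one interleaving."""
    seg = Segment()
    log = []

    if worker_attaches_before_abort:
        seg.attach()  # worker started up in time

    # Leader hits the error and enters its abort path.
    log.append("leader: ERROR: division by zero")

    # Assumption: the leader requests termination and then detaches
    # immediately, without waiting for the worker to have attached.
    seg.detach()

    if worker_attaches_before_abort:
        seg.detach()  # worker exits quietly on the termination request
    else:
        try:
            seg.attach()  # late worker finds the segment gone
        except RuntimeError as exc:
            log.append(f"worker: ERROR: {exc}")
    return log


for line in run(worker_attaches_before_abort=False):
    print(line)
```

In this model only the late-starting worker reports the mapping error;
a worker that attached before the abort exits without one. Whether the
real code behaves this way is exactly the open question here.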

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
