Re: Query running for very long time (server hanged) with parallel append

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Query running for very long time (server hanged) with parallel append
Date: 2018-02-02 15:16:17
Message-ID: CA+TgmoZj-gXqbQD4E1Bg7yVB3bvcoSC5HWSv0DBwLNxBO5uFBQ@mail.gmail.com

On Fri, Feb 2, 2018 at 1:43 AM, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> On 2 February 2018 at 03:50, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>> Whatever logic bug might be causing the query to hang, it's not good
>> that we're unable to SIGINT/SIGTERM our way out of this state. See
>> also this other bug report for a known problem (already fixed but not
>> yet released), but which came with an extra complaint, as yet
>> unexplained, that the query couldn't be interrupted:
>>
>> https://www.postgresql.org/message-id/flat/151724453314.1238.409882538067070269%40wrigleys.postgresql.org
>
> Yeah, it is not good that there is no response to the SIGINT.
>
> The query is actually hanging because one of the workers is in a small
> loop where it iterates over the subplans searching for unfinished
> plans, and it never comes out of the loop (it's a bug which I am yet
> to fix). And it does not make sense to keep CHECK_FOR_INTERRUPTS in
> each iteration; it's a small loop that does not pass control to any
> other functions.

Uh, sounds like we'd better fix that bug.
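
To illustrate the kind of loop being described, here is a minimal sketch of a circular search for an unfinished subplan that cannot spin forever. It is not the actual Parallel Append code: the function name, the `finished` array, and the `NO_SUBPLAN` sentinel are all hypothetical, chosen only to show one way to bound the search so that it visits each slot at most once.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_SUBPLAN (-1)		/* hypothetical sentinel: nothing left to run */

/*
 * Hypothetical helper, not PostgreSQL source.
 *
 * Starting just after 'current', scan the 'finished' array circularly and
 * return the index of the first unfinished subplan, or NO_SUBPLAN if every
 * subplan is already finished.  Pass current = -1 to start from slot 0.
 *
 * The loop counter bounds the scan to nplans steps, so the search always
 * terminates even when no unfinished subplan exists.
 */
int
next_unfinished_subplan(const bool *finished, int nplans, int current)
{
	assert(nplans > 0);
	assert(current >= -1 && current < nplans);

	for (int step = 1; step <= nplans; step++)
	{
		int			candidate = (current + step) % nplans;

		if (!finished[candidate])
			return candidate;
	}

	return NO_SUBPLAN;
}
```

Because the scan is bounded by the number of subplans rather than by finding a match, an "everything is finished" state falls out of the loop instead of cycling, which is the property the buggy loop apparently lacks.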

> But I am not sure about this: while the workers are at it, why does the
> backend that is waiting for the workers not come out of the wait
> state on a SIGINT? I guess the same issue has been discussed in the
> mail thread that you pointed to.

Is it getting stuck here?

    /*
     * We can't finish transaction commit or abort until all of the workers
     * have exited. This means, in particular, that we can't respond to
     * interrupts at this stage.
     */
    HOLD_INTERRUPTS();
    WaitForParallelWorkersToExit(pcxt);
    RESUME_INTERRUPTS();
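
For readers unfamiliar with why a held-off section ignores SIGINT, the following toy model may help. It is a simplified sketch, not the backend's real implementation: every `toy_` name is hypothetical. It assumes only the general scheme that the signal handler merely sets a pending flag, that holding interrupts increments a counter, and that a pending interrupt is serviced only when that counter is zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for backend interrupt state; all names are hypothetical. */
volatile bool toy_interrupt_pending = false;
int			toy_holdoff_count = 0;
int			toy_interrupts_serviced = 0;

/* Enter a section in which interrupts must not be serviced (nestable). */
void
toy_hold_interrupts(void)
{
	toy_holdoff_count++;
}

/* Leave such a section; calls must be balanced with toy_hold_interrupts(). */
void
toy_resume_interrupts(void)
{
	assert(toy_holdoff_count > 0);
	toy_holdoff_count--;
}

/* What a signal handler would do in this model: only record the request. */
void
toy_raise_interrupt(void)
{
	toy_interrupt_pending = true;
}

/* Service a pending interrupt, but only if nothing is holding it off. */
void
toy_check_for_interrupts(void)
{
	if (toy_interrupt_pending && toy_holdoff_count == 0)
	{
		toy_interrupt_pending = false;
		toy_interrupts_serviced++;
	}
}
```

In this model, an interrupt raised between the hold and resume calls stays pending however often it is checked, and is serviced only at the first check after the resume. If a worker never exits, the wait inside the held-off section never returns, so the resume is never reached and the cancel request is never acted on.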

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
