Re: [PATCH v2] Progress command to monitor progression of long running SQL queries

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Remi Colinet <remi(dot)colinet(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH v2] Progress command to monitor progression of long running SQL queries
Date: 2017-05-20 11:42:55
Message-ID: CAA4eK1JhHbMPh+m5R9jyKU+pQEENBzGn6pnUVKjAfPeoqopZNg@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 17, 2017 at 9:43 PM, Remi Colinet <remi(dot)colinet(at)gmail(dot)com> wrote:
>
> 2017-05-13 14:38 GMT+02:00 Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>:
>>
>> On Wed, May 10, 2017 at 10:10 PM, Remi Colinet <remi(dot)colinet(at)gmail(dot)com>
>> wrote:
>> >
>> > Parallel queries can also be monitored. The same mechanism is used to
>> > monitor
>> > child workers with a slight difference: the main worker requests the
>> > child
>> > progression directly in order to dump the whole progress tree in shared
>> > memory.
>> >
>>
>> What if there is any error in the worker (like "out of memory") while
>> gathering the statistics? It seems both for workers as well as for
>> the main backend it will just error out. I am not sure if it is a
>> good idea to error out the backend or parallel worker as it will just
>> end the query execution.
>
>
> Handling a progress report starts with the creation of a MemoryContext
> attached to CurrentMemoryContext. Then a small amount of memory (a few KB)
> is allocated. Handling the progress report could indeed exhaust memory and
> fail the backend request. But in such a situation, the backend could also
> have failed even without any progress request.
>
>>
>> Also, even if it is okay, there doesn't seem
>> to be a way by which a parallel worker can communicate the error back
>> to master backend, rather it will just exit silently which is not
>> right.
>
>
> If a child worker fails, it will not respond to the main backend's request.
> The main backend will resume its execution after a 5-second timeout (a GUC
> parameter may be added for this). In that case, the report would be only
> partially filled.
>
> If the main backend fails, the requesting backend will have a response such
> as:
>
> test=# PROGRESS 14611;
> PLAN PROGRESS
> ----------------
> <backend timeout>
> (1 row)
>
> test=#
>
> and the child workers will log their response to the shared memory. This
> response will not be collected by the main backend which has failed.
>

If the worker errors out for any reason, we should end the main
query as well; otherwise, it can produce wrong results. See the error
handling of workers in HandleParallelMessage.
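
To illustrate the point, here is a minimal, self-contained sketch of the
"worker error ends the main query" rule. It is a toy model, not the actual
PostgreSQL code: the names WorkerMsg, WorkerMsgKind, QueryStatus and
leader_handle_messages are hypothetical and exist only for this example.
The idea is simply that the leader must treat an error message from any
worker as fatal for the whole query, rather than skipping that worker and
continuing with partial data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message kinds a worker might send to the leader. */
typedef enum { MSG_PROGRESS, MSG_ERROR } WorkerMsgKind;

/* Hypothetical message record; not a PostgreSQL structure. */
typedef struct {
    int           worker_id;
    WorkerMsgKind kind;
    const char   *text;     /* progress text or error text */
} WorkerMsg;

typedef enum { QUERY_OK, QUERY_ABORTED } QueryStatus;

/*
 * Scan the messages received from the workers. The first error message
 * aborts the whole query: *failed_worker is set to the id of the worker
 * that reported it. If no worker reported an error, *failed_worker is -1
 * and the query may continue.
 */
QueryStatus
leader_handle_messages(const WorkerMsg *msgs, size_t n, int *failed_worker)
{
    assert(msgs != NULL || n == 0);
    assert(failed_worker != NULL);

    *failed_worker = -1;
    for (size_t i = 0; i < n; i++)
    {
        if (msgs[i].kind == MSG_ERROR)
        {
            /* Do not skip the failed worker: stop the main query. */
            *failed_worker = msgs[i].worker_id;
            return QUERY_ABORTED;
        }
    }
    return QUERY_OK;
}
```

With this rule, a worker that fails with "out of memory" makes the leader
return QUERY_ABORTED instead of waiting for a timeout and producing a
partially filled result.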

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
