Re: On-demand running query plans using auto_explain and signals

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On-demand running query plans using auto_explain and signals
Date: 2015-09-29 20:59:20
Message-ID: CANP8+jKWGXRwRNPUUHwW+cf4dx3xfNXxx5RKPtzpBvXkjZenvQ@mail.gmail.com
Lists: pgsql-hackers

On 29 September 2015 at 20:52, Shulgin, Oleksandr <
oleksandr(dot)shulgin(at)zalando(dot)de> wrote:

> On Tue, Sep 29, 2015 at 8:34 PM, Simon Riggs <simon(at)2ndquadrant(dot)com>
> wrote:
>
>> On 29 September 2015 at 12:52, Shulgin, Oleksandr <
>> oleksandr(dot)shulgin(at)zalando(dot)de> wrote:
>>
>>
>>> Hitting a process with a signal and hoping it will produce a meaningful
>>> response in all circumstances without disrupting its current task was way
>>> too naive.
>>>
>>
>> Hmm, I would have to disagree, sorry. For me the problem was dynamically
>> allocating everything at the time the signal is received and getting into
>> problems when that caused errors.
>>
>
> What I mean is that we need to move the actual EXPLAIN run out of
> ProcessInterrupts(). It can still be fine to trigger the communication
> with a signal.
>

Good

>> * INIT - Allocate N areas of memory for use by queries, which can be
>> expanded/contracted as needed. Keep a freelist of structures.
>> * OBSERVER - When requested, gain exclusive access to a diagnostic area,
>> then allocate the designated process to that area, then send a signal
>> * QUERY - When signal received dump an EXPLAIN ANALYZE to the allocated
>> diagnostic area, (set flag to show complete, set latch on observer)
>> * OBSERVER - process data in diagnostic area and then release area for
>> use by next observation
>>
>> If the EXPLAIN ANALYZE doesn't fit into the diagnostic chunk, LOG it as a
>> problem and copy data only up to the size defined. Any other ERRORs that
>> are caused by this process cause it to fail normally.
>>
>
> Do you envision problems if we do this with a newly allocated DSM every
> time instead of a pre-allocated area? This will have to reverse the workflow,
> because only the QUERY knows the required segment size:
>

That's too fiddly; we need to keep it simple by using just fixed sizes.
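
To make the fixed-size idea concrete, here is a minimal single-process sketch of the INIT / OBSERVER / QUERY steps quoted above. It is only an illustration: the names (DiagArea, diag_acquire, diag_publish, diag_release) and the sizes are hypothetical, and it leaves out everything a real backend would need, such as shared memory, locking, the signal itself and the latch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DIAG_AREAS      4    /* N areas set up at INIT (assumed value) */
#define DIAG_AREA_SIZE  64   /* fixed size of each area (assumed value) */

/* Hypothetical diagnostic area; not an existing PostgreSQL structure. */
typedef struct DiagArea
{
    int   target_pid;            /* process being observed, 0 when free */
    bool  complete;              /* set by QUERY once the text is ready */
    bool  truncated;             /* plan did not fit into the area */
    char  text[DIAG_AREA_SIZE];  /* NUL-terminated plan text */
} DiagArea;

/* INIT: static storage is zero-initialized, so every area starts out free. */
static DiagArea areas[DIAG_AREAS];

/* OBSERVER: claim a free area for target_pid; NULL if all are in use. */
DiagArea *
diag_acquire(int target_pid)
{
    for (int i = 0; i < DIAG_AREAS; i++)
    {
        if (areas[i].target_pid == 0)
        {
            areas[i].target_pid = target_pid;
            areas[i].complete = false;
            areas[i].truncated = false;
            areas[i].text[0] = '\0';
            return &areas[i];
        }
    }
    return NULL;
}

/* QUERY: copy the plan into the area, cutting it off at the fixed size. */
void
diag_publish(DiagArea *area, const char *plan)
{
    size_t len;

    assert(area != NULL && plan != NULL);
    len = strlen(plan);
    if (len >= DIAG_AREA_SIZE)
    {
        len = DIAG_AREA_SIZE - 1;
        area->truncated = true;  /* a real backend would also LOG this */
    }
    memcpy(area->text, plan, len);
    area->text[len] = '\0';
    area->complete = true;       /* a real backend would set the observer's latch */
}

/* OBSERVER: hand the area back for the next observation. */
void
diag_release(DiagArea *area)
{
    assert(area != NULL);
    area->target_pid = 0;
}
```

With fixed sizes the query side never allocates anything when the signal arrives; the worst case is a truncated plan, which is flagged rather than raised as an error.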

> OBSERVER - sends a signal and waits for its proc latch to be set
> QUERY - when signal is received allocates a DSM just big enough to fit the
> EXPLAIN plan, then locates the OBSERVER(s) and sets its latch (or their
> latches)
>
> The EXPLAIN plan should already be produced somewhere in the executor, to
> avoid calling into explain.c from ProcessInterrupts().
>
>> That allows the observer to be another backend, or it allows the query
>> process to perform self-observation based upon a timeout (e.g. >1 hour) or
>> a row limit (e.g. when an optimizer estimate is seen to be badly wrong).
>>
>
> Do you think there is one single best place in the executor code where
> such a check could be added? I have very little idea about that.
>

Fairly simple.

Main problem is knowing how to handle nested calls to the executor.
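
As a rough illustration of the self-observation check and the nesting question, the sketch below counts executor nesting and only lets the outermost query trigger a report. This is one possible policy, not a settled design; every name here (executor_enter, executor_leave, should_self_observe) is hypothetical, and the factor of 10 for a "badly wrong" estimate is an assumed value. The one-hour timeout is the example figure mentioned above.

```c
#include <assert.h>
#include <stdbool.h>

#define TIMEOUT_SECONDS         3600.0  /* ">1 hour" example from the thread */
#define ROW_MISESTIMATE_FACTOR  10.0    /* assumed threshold for "badly wrong" */

/* Depth of nested executor invocations in this process (hypothetical). */
static int executor_nesting = 0;

/* Call when an executor run starts. */
void
executor_enter(void)
{
    executor_nesting++;
}

/* Call when an executor run ends. */
void
executor_leave(void)
{
    assert(executor_nesting > 0);
    executor_nesting--;
}

/*
 * Decide whether the running query should dump its own plan.
 * Assumed policy: only the outermost executor call reports, so a nested
 * call (for example a query run from inside a function) stays silent.
 */
bool
should_self_observe(double elapsed_seconds,
                    double estimated_rows,
                    double actual_rows)
{
    if (executor_nesting != 1)
        return false;
    if (elapsed_seconds > TIMEOUT_SECONDS)
        return true;
    if (estimated_rows > 0.0 &&
        actual_rows > estimated_rows * ROW_MISESTIMATE_FACTOR)
        return true;
    return false;
}
```

Whether nested calls should stay silent, report separately, or be folded into the outer plan is exactly the open question; the counter only shows where such a decision would be made.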

I'll look at the patch.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
