Re: On-demand running query plans using auto_explain and signals

From: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On-demand running query plans using auto_explain and signals
Date: 2015-12-17 09:45:32
Message-ID: CACACo5RBCSvtmh_apfqkyTSrgPBF8nPKbsb4nbe9-wvLa8N=0w@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 16, 2015 at 8:39 PM, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

> Hi,
>
> On 12/01/2015 10:34 AM, Shulgin, Oleksandr wrote:
>
>>
>> I have the plans to make something from this on top of
>> pg_stat_statements and auto_explain, as I've mentioned last time. The
>> next iteration will be based on the two latest patches above, so it
>> still makes sense to review them.
>>
>> As for EXPLAIN ANALYZE support, it will require changes to core, but
>> this can be done separately.
>>
>
> I'm re-reading the thread, and I have to admit I'm utterly confused about
> what the current plan is, what features it's supposed to provide, and
> whether it will solve the use case I'm most interested in. Oleksandr,
> could you please post a summary explaining that?
>
> My use case for this functionality is debugging of long-running queries,
> particularly getting EXPLAIN ANALYZE for them. In such cases I can't run
> EXPLAIN ANALYZE manually, because it would either run forever (just like
> the query) or might not produce the same plan (e.g. due to a recent
> ANALYZE). So we need to extract the data from the process executing the
> query.
>
> I'm not essentially opposed to doing this in an extension (better an
> extension than nothing), but I don't quite see how you could do that
> from pg_stat_statements or auto_explain. AFAIK both extensions merely use
> hooks before/after the executor, and therefore can't do anything in between
> (while the query is actually running).
>
> Perhaps you don't intend to solve this particular use case? Which use
> cases are you aiming to solve, then? Could you explain?
>

Hi Tomas.

Thanks for your interest in this patch!

My motivation is the same as your use case: given a long-running query, be
able to look inside that exact query as run by that exact backend.

I admit the evolution of ideas in this patch can be very confusing: we were
trying a number of different approaches, without thinking deeply about the
implications, just to have a proof of concept.
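
To your point about the hooks: indeed, both extensions today only wrap the
executor entry and exit points, roughly like this (a simplified sketch of
the usual hook pattern, not the extensions' actual code):

    #include "postgres.h"
    #include "fmgr.h"
    #include "executor/executor.h"

    PG_MODULE_MAGIC;

    static ExecutorStart_hook_type prev_ExecutorStart = NULL;
    static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;

    static void
    my_ExecutorStart(QueryDesc *queryDesc, int eflags)
    {
        /* fires once, just before the query starts executing */
        if (prev_ExecutorStart)
            prev_ExecutorStart(queryDesc, eflags);
        else
            standard_ExecutorStart(queryDesc, eflags);
    }

    static void
    my_ExecutorEnd(QueryDesc *queryDesc)
    {
        /*
         * Fires once, after the query has finished -- this is where
         * auto_explain logs its plan.  Nothing fires in between, which
         * is exactly why these hooks alone can't look inside a query
         * that is still running.
         */
        if (prev_ExecutorEnd)
            prev_ExecutorEnd(queryDesc);
        else
            standard_ExecutorEnd(queryDesc);
    }

    void
    _PG_init(void)
    {
        prev_ExecutorStart = ExecutorStart_hook;
        ExecutorStart_hook = my_ExecutorStart;
        prev_ExecutorEnd = ExecutorEnd_hook;
        ExecutorEnd_hook = my_ExecutorEnd;
    }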

> Maybe all we need to do is add another hook somewhere in the executor, and
> push the explain analyze into pg_stat_statements once in a while, entirely
> eliminating the need for inter-process communication (signals, queues, ...).
>
> I'm pretty sure we don't need this for "short" queries, because in those
> cases we have other means to get the explain analyze (e.g. running the
> query again or auto_explain). So I can imagine having a rather high
> threshold (say, 60 seconds or even more), and we'd only push the explain
> analyze after crossing it. And then we'd only update it once in a while -
> say, again every 60 seconds.
>
> Of course, this might be configurable by two GUCs:
>
> pg_stat_statements.explain_analyze_threshold = 60 # -1 is "off"
> pg_stat_statements.explain_analyze_refresh = 60
>
> FWIW I'd still prefer having "EXPLAIN ANALYZE" in core, but better this
> than nothing.
>

Yes, this is how the pg_stat_statements idea comes into play: even if we
implement support for online EXPLAIN ANALYZE, enabling it for every query
is overkill, but if you don't enable it from the query start, there is no
way to enable it later on, once the query has already progressed. So in
order to know for which queries it makes sense to enable this feature, we
need to know the query's expected run time, and pg_stat_statements seems
like the natural place to obtain this information and/or trigger the
behavior.
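
For reference, the two GUCs you propose could be registered from the
extension's _PG_init() in the usual way (a sketch only -- the names and
defaults follow your proposal above, none of this exists yet):

    #include "postgres.h"
    #include <limits.h>
    #include "utils/guc.h"

    static int explain_analyze_threshold = 60;  /* seconds, -1 = off */
    static int explain_analyze_refresh = 60;    /* seconds between pushes */

    void
    _PG_init(void)
    {
        DefineCustomIntVariable("pg_stat_statements.explain_analyze_threshold",
                                "Run time after which a query starts pushing "
                                "EXPLAIN ANALYZE snapshots.",
                                "-1 disables the feature.",
                                &explain_analyze_threshold,
                                60, -1, INT_MAX,
                                PGC_SUSET, GUC_UNIT_S,
                                NULL, NULL, NULL);

        DefineCustomIntVariable("pg_stat_statements.explain_analyze_refresh",
                                "Interval between snapshot refreshes.",
                                NULL,
                                &explain_analyze_refresh,
                                60, 1, INT_MAX,
                                PGC_SUSET, GUC_UNIT_S,
                                NULL, NULL, NULL);
    }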

I'm also all for simplifying the underlying communication mechanism:
shared memory queues are neat, but not necessarily the best way to handle
this. As for the use of signals: I believe that was a red herring. The
code will be much less fragile if the running backend itself publishes
intermediate EXPLAIN ANALYZE reports for other backends to read.
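
To make "publish" concrete, a minimal sketch, assuming a fixed-size slot
per backend in shared memory (all names are illustrative; slot allocation
is elided, and a real implementation would rather use an LWLock than hold
a spinlock across a large copy):

    #include "postgres.h"
    #include "miscadmin.h"
    #include "storage/backendid.h"
    #include "storage/spin.h"

    /* hypothetical: one slot per backend, allocated at shmem init time */
    typedef struct ProgressSlot
    {
        slock_t mutex;
        uint32  generation;     /* bumped on every publish */
        char    plan[8192];     /* (truncated) EXPLAIN ANALYZE text */
    } ProgressSlot;

    static ProgressSlot *slots; /* MaxBackends entries */

    /*
     * Called by the running backend itself at a safe point (e.g. from an
     * executor hook), so no signals are involved and readers never have
     * to wait on the query.
     */
    static void
    publish_plan(const char *explain_text)
    {
        ProgressSlot *slot = &slots[MyBackendId - 1];

        SpinLockAcquire(&slot->mutex);
        strlcpy(slot->plan, explain_text, sizeof(slot->plan));
        slot->generation++;
        SpinLockRelease(&slot->mutex);
    }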

The EXPLAIN (without ANALYZE) part we can do completely as an extension: no
core support required. Enabling ANALYZE will require a little hacking
around the Instrumentation methods: otherwise the Explain functions simply
crash when run in the middle of a query.
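
Roughly the kind of hack I mean (a sketch, assuming planstate->instrument
is set, i.e. the query was started with instrumentation): work on a copy
of the node's Instrumentation and close the in-progress cycle on the copy,
so the Explain code sees consistent counters while the live ones keep
running:

    #include "postgres.h"
    #include "executor/instrument.h"

    /* hypothetical helper: a consistent mid-query snapshot */
    static Instrumentation
    snapshot_instrument(Instrumentation *live)
    {
        Instrumentation instr = *live;

        /*
         * The node may be in the middle of timing a tuple; fold the
         * partial timing into the counter and clear starttime, or
         * InstrEndLoop() will complain about a running node.
         */
        if (!INSTR_TIME_IS_ZERO(instr.starttime))
        {
            instr_time  now;

            INSTR_TIME_SET_CURRENT(now);
            INSTR_TIME_ACCUM_DIFF(instr.counter, now, instr.starttime);
            INSTR_TIME_SET_ZERO(instr.starttime);
        }

        /* accumulate the current loop into the totals, as at loop end */
        InstrEndLoop(&instr);

        return instr;
    }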

Hope that makes it clear.

--
Alex
