Re: [WIP] Patches to enable extraction state of query execution from external session

From: Maksim Milyutin <m(dot)milyutin(at)postgrespro(dot)ru>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [WIP] Patches to enable extraction state of query execution from external session
Date: 2016-08-31 13:09:15
Message-ID: 488c2544-4359-e04c-bb32-96ab04035b4e@postgrespro.ru
Lists: pgsql-hackers


> On 2016-08-30 11:22:43 +0300, Maksim Milyutin wrote:
>>> Hi,
>>>
>>> On 2016-08-29 18:22:56 +0300, maksim wrote:
>>>> Now I complete extension that provides facility to see the current state of
>>>> query execution working on external session in form of EXPLAIN ANALYZE
>>>> output. This extension works on 9.5 version, for 9.6 and later it doesn't
>>>> support detailed statistics for parallel nodes yet.
>>>
>>> Could you expand a bit on what you want this for exactly?
>>
>> Max goal - to push my extension to postgres core. But now it's ready only
>> for 9.5. Prerequisites of this extension are patches presented here.
>
> I'm asking what you want this for. "An extension" isn't a detailed
> description...
>

I want to provide a facility to fetch the state of a query running on
some other backend on the same server. In essence, it's going to be a
micro-level monitoring tool. A typical use case looks like this:

1) assume the 1st backend executes a simple query:
select * from foo join bar on foo.c1=bar.c1
2) somebody wants to fetch the state of that backend, so they address it
by pid:
select * from pg_query_state(pid := <1st_backend_pid>)
3) they get a detailed description of the state, something like this:

Hash Join (Current loop: actual rows=0, loop number=1)
  Hash Cond: (foo.c1 = bar.c1)
  -> Seq Scan on foo (Current loop: actual rows=1, loop number=1)
  -> Hash (Current loop: actual rows=0, loop number=1)
       Buckets: 131072 Batches: 8 Memory Usage: 1kB
       -> Seq Scan on bar (Current loop: actual rows=49, loop number=1)

Note that I've added *Current loop* records with the number of emitted
rows (*actual rows*) and the *loop number* attached to each node. We could
also add timing info.
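
To illustrate where such a *Current loop* record could come from, here is a
minimal sketch. It assumes a node's counters are split the way PostgreSQL's
Instrumentation struct splits them: tuples of the unfinished loop are kept
apart from the totals that InstrEndLoop folds in. The struct below is a
simplified stand-in, and the names InstrCounters, CurrentLoop, end_loop,
current_loop and print_current_loop are made up for this example; it is not
the code from the patches.

```c
#include <assert.h>
#include <stdio.h>

/* Simplified stand-in for per-node counters (not the real struct). */
typedef struct InstrCounters
{
    double tuplecount;  /* tuples emitted so far in the current loop */
    double ntuples;     /* tuples accumulated over finished loops */
    double nloops;      /* number of finished loops */
} InstrCounters;

/* Hypothetical snapshot of a node that is still running. */
typedef struct CurrentLoop
{
    double actual_rows;
    double loop_number;
} CurrentLoop;

/* What a regular explain relies on: fold the current loop into the totals. */
void
end_loop(InstrCounters *instr)
{
    assert(instr != NULL);
    instr->ntuples += instr->tuplecount;
    instr->nloops += 1;
    instr->tuplecount = 0;
}

/* Read the state of the unfinished loop without modifying the counters. */
CurrentLoop
current_loop(const InstrCounters *instr)
{
    CurrentLoop cur;

    assert(instr != NULL);
    cur.actual_rows = instr->tuplecount;
    cur.loop_number = instr->nloops + 1;  /* loops are numbered from 1 */
    return cur;
}

/* Print the record in the format used in the plan output above. */
void
print_current_loop(const InstrCounters *instr)
{
    CurrentLoop cur = current_loop(instr);

    printf("(Current loop: actual rows=%.0f, loop number=%.0f)\n",
           cur.actual_rows, cur.loop_number);
}
```

The point of the sketch is only that the running state can be read without
ending the loop, so the observed backend's counters stay untouched.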

For parallel nodes I want to print statistics for each worker separately
(it's not finished yet).

You could also watch my screencast (it's short enough) to get the idea:
https://asciinema.org/a/981bed2lu7r8sx60u5lsjei30

>
>>>> 2. Patch that enables to interrupt the query executor
>>>> (executor_hooks.patch).
>>>> This patch enables to hang up hooks on executor function of each node
>>>> (ExecProcNode). I define hooks before any node execution and after
>>>> execution.
>>>> I use this patch to add possibility of query tracing by emitted rows from
>>>> any node. I interrupt query executor after any node delivers one or zero
>>>> rows to upper node. And after execution of specific number trace steps I can
>>>> get the query state of traceable backend which will be somewhat
>>>> deterministic. I use this possibility for regression tests of my extension.
>>>
>>> This will increase executor overhead.
>>
>> In simple case we have checks on existence of hooks.
>
> That *is* noticeable.
>

Then I'll seriously consider hiding the hook checks inside the
"if (instrument)" branch, thanks!

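A rough sketch of what that could look like, with toy stand-ins instead of
the real executor types. ToyInstr, ToyPlanState, the hook variables
pre_exec_node_hook / post_exec_node_hook, and count_hook are all names made
up for this example; the actual patch may look quite different.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, not the real executor structs. */
typedef struct ToyInstr
{
    double tuplecount;          /* rows emitted in the current loop */
} ToyInstr;

typedef struct ToyPlanState
{
    ToyInstr *instrument;       /* NULL unless instrumentation was requested */
    int       rows_left;        /* toy state: rows this node will still emit */
} ToyPlanState;

/* Hypothetical hook type and hook variables. */
typedef void (*node_hook_type) (ToyPlanState *node);
node_hook_type pre_exec_node_hook = NULL;
node_hook_type post_exec_node_hook = NULL;

/* Example hook that only counts how often it was called. */
int hook_calls = 0;

void
count_hook(ToyPlanState *node)
{
    (void) node;
    hook_calls++;
}

/* Toy node body: returns 1 if a row was produced, 0 when exhausted. */
static int
exec_node_body(ToyPlanState *node)
{
    if (node->rows_left <= 0)
        return 0;
    node->rows_left--;
    return 1;
}

/*
 * The hook checks live inside the instrument branches, so a node without
 * instrumentation pays only for the NULL test it already had.
 */
int
exec_proc_node(ToyPlanState *node)
{
    int got_row;

    assert(node != NULL);

    if (node->instrument)
    {
        if (pre_exec_node_hook)
            pre_exec_node_hook(node);
    }

    got_row = exec_node_body(node);

    if (node->instrument)
    {
        if (got_row)
            node->instrument->tuplecount += 1;
        if (post_exec_node_hook)
            post_exec_node_hook(node);
    }

    return got_row;
}
```

With this layout a node whose instrument pointer is NULL never reaches the
hooks, even if they are installed.
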
>>> I think we'll need to find a way
>>> to hide this behind the existing if (instrument) branches.
>>
>> And so can be. It doesn't matter for trace mode. But I think instrument
>> branch is intended only for collecting statistics by nodes.
>
> I can't follow here. That's all what analyze is about?
>

I meant that hiding the hooks is not a universal solution. If the
'instrument' variable is NULL (e.g. a query without analyze), the hooks
become disabled. But in my case 'instrument' is always initialized, so
that isn't a concern for me.

>> 3. Patch that enables to output runtime explain statistics
>> (runtime_explain.patch).
>> This patch extends the regular explain functionality. The problem is in the
>> point that regular explain call makes result output after query execution
>> performing InstrEndLoop on nodes where necessary. My patch introduces
>> specific flag *runtime* that indicates whether we explain running query and
>> does some insertions in source code dedicated to output the statistics of
>> running query.
>
> Unless I'm missing something this doesn't really expose a user of this
> functionality?
>

I could probably keep *runtime_explain.patch* out of the Postgres core by
copying *explain.c* into the module's directory and customizing it for my
purposes. But in that case I'd have to maintain a 'local explain', fixing
bugs and coping with other issues from time to time (e.g. on a major
upgrade).

--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
