Re: On-demand running query plans using auto_explain and signals

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: On-demand running query plans using auto_explain and signals
Date: 2015-09-03 20:06:59
Message-ID: CAFj8pRBWZbYpT-wRksbxcCUAHZ4+15JJDtq88heaxbXa-sPLiQ@mail.gmail.com
Lists: pgsql-hackers

Hi

2015-09-03 18:30 GMT+02:00 Shulgin, Oleksandr <oleksandr(dot)shulgin(at)zalando(dot)de>:

> On Wed, Sep 2, 2015 at 3:07 PM, Shulgin, Oleksandr <
> oleksandr(dot)shulgin(at)zalando(dot)de> wrote:
>
>> On Wed, Sep 2, 2015 at 3:04 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
>> wrote:
>>>
>>>
>>>> Well, maybe I'm missing something, but shm_mq_create() will just
>>>> overwrite the contents of the struct, so it doesn't care about
>>>> sender/receiver: only shm_mq_set_sender/receiver() do.
>>>>
>>>
>>> If you create the shm_mq from scratch, then you can reuse the structure.
>>>
>>
> Please find attached a v3.
>
> It uses a shared memory queue and also has the ability to capture plans
> nested deeply in the call stack. Not sure about using the executor hook,
> since this is not an extension...
>
> The LWLock is used around initializing/cleaning the shared struct and the
> message queue, the IO synchronization is handled by the message queue
> itself. After some testing with concurrent pgbench and intentionally deep
> recursive plpgsql functions (up to 700 plpgsql stack frames) I think this
> approach can work. Unless there's some theoretical problem I'm just not
> aware of. :-)
>
>
> Comments welcome!
>

I am not very happy with this design. Only one EXPLAIN PID/GET STATUS can be
executed at a time per server. I remember a lot of queries that don't handle
CANCEL well (that is, they don't handle interrupts well), and this can be
unfriendly. I cannot say whether it is good enough for a first iteration. This
functionality is meant for diagnostics on an overloaded server, and this risk
looks too high (to me). The idea of a receive slot can solve this risk well.
The difference from this code should not be too big, although it is not
trivial: it needs work with PGPROC.
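
A rough sketch of what I mean by receive slots. This is a standalone C model,
not PostgreSQL code: ReceiveSlot, slot_acquire, slot_release and MAX_SLOTS are
hypothetical names. A real implementation would keep the slots in shared
memory, tie them to PGPROC and protect them with a lock. The only point is that
each requester claims its own slot, so one target that ignores interrupts does
not block every other request on the server.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 4             /* hypothetical; would scale with backends */

typedef struct ReceiveSlot
{
    int in_use;                 /* nonzero while a request is in flight */
    int requester_pid;          /* backend waiting for the plan */
    int target_pid;             /* backend asked to report its plan */
} ReceiveSlot;

/* Zero-initialized, so every slot starts out free. */
ReceiveSlot slots[MAX_SLOTS];

/* Claim the first free slot; return its index, or -1 if all are busy. */
int
slot_acquire(int requester_pid, int target_pid)
{
    for (int i = 0; i < MAX_SLOTS; i++)
    {
        if (!slots[i].in_use)
        {
            slots[i].in_use = 1;
            slots[i].requester_pid = requester_pid;
            slots[i].target_pid = target_pid;
            return i;
        }
    }
    return -1;
}

/* Give the slot back once the plan was received or the request timed out. */
void
slot_release(int idx)
{
    assert(idx >= 0 && idx < MAX_SLOTS);
    slots[idx].in_use = 0;
}
```

With a single global slot, a second slot_acquire() would fail until the first
target answers; with several slots, only requests beyond MAX_SLOTS have to wait.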

Other smaller issues:

* sending line by line is probably useless - shm_mq_send can pass bigger
data when nowait = false
* pg_usleep(1000L); - this is related to the single-point resource
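
To illustrate the first point, here is a small standalone sketch that
contrasts one message per plan line with one message for the whole plan. It is
not the patch code: send_msg is a hypothetical stand-in for the queue send that
only counts calls, and send_line_by_line / send_as_one are names invented for
this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

int send_calls = 0;             /* number of queue messages "sent" */

/* Hypothetical stand-in for a queue send; it only counts calls. */
void
send_msg(const char *data, size_t len)
{
    (void) data;
    (void) len;
    send_calls++;
}

/* One queue message per plan line. */
void
send_line_by_line(const char **lines, int nlines)
{
    for (int i = 0; i < nlines; i++)
        send_msg(lines[i], strlen(lines[i]));
}

/*
 * Join the lines with '\n' into buf and send them as one message.
 * Returns the number of bytes sent, or 0 if buf is too small
 * (in which case nothing is sent).
 */
size_t
send_as_one(const char **lines, int nlines, char *buf, size_t bufsize)
{
    size_t used = 0;

    assert(buf != NULL);
    for (int i = 0; i < nlines; i++)
    {
        size_t len = strlen(lines[i]);

        if (used + len + 1 > bufsize)
            return 0;
        memcpy(buf + used, lines[i], len);
        used += len;
        buf[used++] = '\n';
    }
    send_msg(buf, used);
    return used;
}
```

For a plan of N lines the first variant makes N sends and the second makes one,
which also means fewer wakeups of the receiver.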

Some ideas:

* this code shares some important parts with auto_explain (the query stack),
and because it should be in core (due to signal handling, if I remember well),
it can be a first step toward integrating auto_explain into core.

> --
> Alex
>
>
