Re: On-demand running query plans using auto_explain and signals

From: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On-demand running query plans using auto_explain and signals
Date: 2015-09-25 17:13:13
Message-ID: CACACo5StM7ikEEJi64B49k_zpUMXWVhdDKKyiHBS5tv+5S4VfA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 18, 2015 at 7:04 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

>
> Frankly, I think you guys are making this out to be way more
> complicated than it really is. Basically, I think the process being
> queried should publish a DSM via a slot it owns. The recipient is
> responsible for reading it and then notifying the sender. If a second
> process requests data before the first process reads its data, the
> second process can either (1) become an additional reader of the
> already-published data or (2) wait for the first process to finish,
> and then send its own inquiry.
>
> There are some problems to solve here, but they hardly seem impossible.

Thank you Robert, for your invaluable input on this patch!

I now believe that use of ProcessInterrupts() in the recently proposed
design as well as manipulation of proc latch due to use of shared memory
queue are major blockers.

In order to simplify things up, I've taken a step back and looked again at
the auto_explain contrib module. I now propose the most simple thing
possible in my opinion: a new GUC option for auto_explain. It will make
every backend, in which the module is loaded via *_preload_libraries
mechanism or using CREATE EXTENSION/LOAD commands, to actively publish the
plans of queries in dynamic shared memory as long as
auto_explain.publish_plans is turned on.

The greatest part for me, is that this approach doesn't require handling of
signals and is isolated in an external module, so it can be readily used
with the current server versions, no need to wait for >= 9.6.

Some implementation details:

For every backend that might be running (MaxBackends) we reserve a
dsm_handle slot in the addins shared memory. When the new option is turned
on, the ExecutorRun hook will produce a plan in whatever format is
specified by the auto_explain.log_format, allocate a DSM segment, copy the
plan into the segment and finally store the DSM handle into its own slot.
No locking is required around this because every backend writes to its slot
exclusively, no other backend can be writing into it concurrently. In the
ExecutorFinish hook we invalidate the backend's slot by setting the handle
to 0 and deallocate the DSM segment.

Reading of the plan is performed by a newly added function
pg_explain_backend(PID). Since it can determine the target process'
backendId, it simply reads the DSM handle from that backend's slot and
tries to attach it (there's not much point in checking the handle for being
non-0, because the other backend could release the segment the moment after
we've checked it, so we rely on dsm_attach returning non-NULL). If
attached successfully, we parse the contents and detach. At this point the
backend to detach the last is actually releasing the segment, due to
reference counting.

Handling of the nested statements plans is an open question. It can be
really useful when the top-level plan is simply displaying a "SELECT
my_stored_procedure()" and all the interesting stuff is happening behind
the scenes, but I didn't start to think about how this could be implemented
yet.

Pavel was really interested in retrieving the complete query text/plans
which could be over a megabyte in his case (and pg_stat_activity.query is
capped by 10240 bytes I believe). This is now possible with the patch, but
some others might still want to put a threshold on the allocation,
especially given this is shared memory. I can envision another GUC, but in
our experience the extremely long queries (and plans) are most of the time
due to use of VALUES() or IN() clauses with a huge list of literals.

I think we could fold the VALUES/IN into "?" if the query/plan text exceeds
the specified threshold, or unconditionally (yes, pg_stat_statements, I'm
looking at you). This should help in the cases when the most interesting
part is in the plan nodes near the end, but there's such a huge list of
literals before it.

Future plans:

I believe this approach can be extended to enable instrumentation once
again. The backend executing the query could update the published plan
every once in a while (for example, every N ms or 1% of rows processed in a
node), and any other process interested in this data, can simply read it
without the need for signals and complex and fragile communication. This
obviously requires a core patch.

Some problems:

There is a potential problem with the limited total number of DSM segments:
it is reserved in a way to only allow 2 per backend (on average) and 64
additional per server, so if you run with the new option enabled at all
times, you're down to only 1 additional DSM per backend (again, on
average). Not sure how critical this can be, but no one is forced to run
with this option enabled for all backends.

If you don't want to run it enabled at all times, then enabling the GUC
per-backend can be problematic. It's still possible to update the conf
file and send SIGHUP to a single backend, but it's harder to accomplish
over psql, for example. I think here we might still have some luck with
the signals: use another array of per-backend slots with flags, set the
target backend's flag and send it SIGUSR1. The backend wakes on the signal
and examines its slot, then toggles the GUC if needed. Sounds pretty safe,
eh?

No documentation changes yet, waiting for your comments. :-)

Happy hacking!
--
Alex

Attachment Content-Type Size
0001-Add-auto_explain.publish_plans.patch text/x-patch 14.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joshua D. Drake 2015-09-25 17:13:57 Re: No Issue Tracker - Say it Ain't So!
Previous Message Simon Riggs 2015-09-25 17:11:12 Re: No Issue Tracker - Say it Ain't So!