Re: On-demand running query plans using auto_explain and signals

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On-demand running query plans using auto_explain and signals
Date: 2015-09-12 18:27:52
Message-ID: CAFj8pRBJEYB3QRT_uhkYdzfyddyUXE14P1Sepp0gQVOermkzWw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

Thank you for precious check.

2015-09-12 11:50 GMT+02:00 Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>:

> Hi,
>
> I did a quick initial review of this patch today, so here are my comments
> so far:
>
> - ipcs.c should include utils/cmdstatus.h (the compiler complains
> about implicit declaration of two functions)
>
> - Attempts to get plan for simple insert queries like this
>
> INSERT INTO x SELECT * FROM x;
>
> end with a segfault, because ActivePortal->queryDesc is 0x0 for this
> query. Needs investigation.
>
> - The lockless approach seems fine to me, although I think the fear
> of performance issues is a bit moot (I don't think we expect large
> number of processes calling pg_cmdstatus at the same time). But
> it's not significantly more complex, so why not.
>
> - The patch contains pretty much no documentation, both comments
> at the code level and user docs. The lack of user docs is not that
> a big deal at this point (although the patch seems to be mature
> enough, although the user-level API will likely change).
>
> The lack of code comments is more serious, as it makes the review
> somewhat more difficult. For example it'd be very nice to document
> the contract for the lock-less interface.
>
> - I agree that pg_cmdstatus() is not the best API. Having something
> like EXPLAIN PID would be nice, but it does not really work for
> all the request types (just CMD_STATUS_REQUEST_EXPLAIN). Maybe
> there's not a single API for all cases, i.e. we should use EXPLAIN
> PID for one case and invent something different for the other?
>

I used this simple interface because it is faster way how to get a
prototype. In this patch the most complexity is in interprocess
communication and in executor hooking. The UI can be designed later very
quickly.

There is consensus about EXPLAIN PID, for other (status, query) I don't
want to introduce new keyword, so there will be accessed via functions. But
we can introduce psql commands

\qpid - query pid and \spid - status pid

> - Is there a particular reason why we allocate slots for auxiliary
> processes and not just for backends (NumBackends)? Do we expect those
> auxiliary processes to ever use this API?
>

The introduced interprocess communication (we talked about this possibility
week ago) can be used by any diagnostic extension - so aux processes can be
interesting too.

>
> - CleanupCmdStatusSlot seems needlessly complicated. I don't quite see
> the need for the second argument, or the internal slot variable. Why
> not to simply use the MyCmdStatusSlot directly?
>
> - I also don't quite understand why we need to track css_pid for the
> slot? In what scenario will this actually matter?
>
>
I have to recheck it - but this is safeguard against to change process with
same backendId.

> - While being able to get EXPLAIN from the running process is nice,
> I'm especially interested in getting EXPLAIN ANALYZE to get insight
> into the progress of the execution. The are other ways to get the
> EXPLAIN, e.g. by opening a different connection and actually running
> it (sure, the plan might have changed since then), but currently
> there's no way to get insight into the progress.
>

It can be pretty nice feature. I though about it - as next step of this
feature. But I have not idea how it is complex task - maybe not too much.

>
> From the thread I get the impression that Oleksandr also finds this
> useful - correct? What are the plans in this direction?
>
> ISTM we need at least two things for that to work:
>
> (a) Ability to enable instrumentation on all queries (effectively
> what auto_explain allows), otherwise we can't get EXPLAIN ANALYZE
> on the queries later. But auto_explain is an extension, so that
> does not seem as a good match if this is supposed to be in core.
> In that case a separate GUC seems appropriate.
>

If described functionality will be implemented, then auto_explain extension
will be very thin envelop - I don't see any reason, why it should not be
integrated to core.

>
> (b) Being able to run the InstrEnd* methods repeatedly - the initial
> message in this thread mentions issues with InstrEndLoop for
> example. So perhaps this is non-trivial.
>
> - And finally, I think we should really support all existing EXPLAIN
> formats, not just text. We need to support the other formats (yaml,
> json, xml) if we want to use the EXPLAIN PID approach, and it also
> makes the plans easier to process by additional tools.
>

surely - there is not any reason why not.

Regards

Pavel

>
>
> regards
>
> --
> Tomas Vondra http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2015-09-12 20:19:41 Re: Reducing the size of BufferTag & remodeling forks
Previous Message Tom Lane 2015-09-12 15:51:30 Re: Double linking MemoryContext children