On-demand running query plans using auto_explain and signals

From: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
To: Pavel Stěhule <pavel(dot)stehule(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Greg S <stark(at)mit(dot)edu>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: On-demand running query plans using auto_explain and signals
Date: 2015-08-29 15:33:22
Message-ID: CACACo5Sz7G0MFauC082iM=XX_hQ7qQ5ndR4JPo+H-O5vp6iCcQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

The other day I've rediscovered the exciting idea of using signals to dump
a backend's currently executed query plan, which, it turns out, was already
proposed by Pavel and Simon in these threads:

http://www.postgresql.org/message-id/flat/CAFj8pRA-DuzkmDtu52CiUgb0P7TVri_B8LtjMJfWcnr1LPts6w(at)mail(dot)gmail(dot)com

http://www.postgresql.org/message-id/flat/CAFj8pRDEo24joEg4UFRDYeFADFTw-jw_=t=kPwOyDW=v=g1Fhg(at)mail(dot)gmail(dot)com#

Unfortunately, the latest one is missing an actual patch (or the attachment
was scrubbed) and I'm really surprised that the idea didn't take off then.

While evaluating it myself I've decided to simply patch auto_explain module
which is now in contrib, so presumably should be available to a broad
audience. Proof-of-concept patch against master is attached (a patch for
an older branch like 9.0 requires trivial manual effort to adapt).

While I can see the value of in-core support for this, realistically this
is unlikely to be included in 9.5, but an ad hoc patch could be useful long
before that happens.

What this patch does is simply dump the plan of the query being run by the
executor to the log when signaled with SIGUSR1. The auto_explain module
must be loaded to the backend beforehand of course,
session_preload_libraries seems like the place to do that cluster-wide.

Probably using SIGUSR2 would be more appropriate, but I'm not sure if there
are other extensions out there that might be already using it for some
other reason (well, I do not know that for SIGUSR1 either). Looking at the
current state of affairs in procsignal_sigusr1_handler() makes me believe
it should be pretty safe to catch the signal like I do. Or is that not the
case?

The current_query_desc probably needs to be a stack-like structure in order
to keep track of the nested queries correctly, but it works in the simplest
cases.

What would be even more useful is including stats from the running query in
the explain output, if you're a willing to pay for a (hopefully small)
overhead. Unfortunately, that doesn't work out of the box: if you enable
the auto_explain.log_analyze and friends in the .conf-file, you either get
all zero counts, or if you're really unlucky, an error from InstrEndLoop():

ERROR: InstrEndLoop called on running node

The latest patch in this area I could found is this one:
http://www.postgresql.org/message-id/87wsn82lda.fsf@oxford.xeocode.com

From what I can see, there's no way around this problem short of hacking
InstrEndLoop... Did anything change in this area since 2008 possibly? I
would really love to have a way to make this work with existing un-patched
servers.

Cheers!
--
Alex

Attachment Content-Type Size
explain-on-signal.patch text/x-patch 5.2 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2015-08-29 15:40:54 Re: 9.4 broken on alpha
Previous Message David Fetter 2015-08-29 15:32:29 Re: 9.4 broken on alpha