explain plans with information about (modified) gucs

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: explain plans with information about (modified) gucs
Date: 2018-12-14 11:41:36
Message-ID: e1791b4c-df9c-be02-edc5-7c8874944be0@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hi,

every now and then I have to investigate an execution plan that is
strange in some way and I can't reproduce the same behavior. Usually
it's simply due to data distribution changing since the problem was
observed (say, after a nightly batch load/update).

In many cases it however may be due to some local GUC tweaks, usually
addressing some query specific issues (say, disabling nested loops or
lowering join_collapse_limit). I've repeatedly ran into cases where the
GUC was not properly reset to the "regular" value, and it's rather
difficult to identify this is what's happening. Or cases with different
per-user settings and connection pooling (SET SESSION AUTHORIZATION /
ROLE etc.).

So I propose to extend EXPLAIN output with an additional option, which
would include information about modified GUCs in the execution plan
(disabled by default, of course):

test=# explain (gucs) select * from t;

QUERY PLAN
--------------------------------------------------------------------
Seq Scan on t (cost=0.00..35.50 rows=2550 width=4)
GUCs: application_name = 'x', client_encoding = 'UTF8',
cpu_tuple_cost = '0.01'
(2 rows)

Of course, this directly applies to auto_explain too, which gets a new
option log_gucs.

The patch is quite trivial, but there are about three open questions:

1) names of the options

I'm not particularly happy with calling the option "gucs" - it's an
acronym and many users have little idea what GUC stands for. So I think
a better name would be desirable, but I'm not sure what would that be.
Options? Parameters?

2) format of output

At this point the names/values are simply formatted into a one-line
string. That's not particularly readable, and it's not very useful for
the YAML/JSON formats I guess. So adding each modified GUC as an extra
text property would be better.

3) identifying modified (and interesting) GUCs

We certainly don't want to include all GUCs, so the question is how to
decide which GUCs are interesting. The simplest approach would be to
look for GUCs that changed in the session (source == PGC_S_SESSION), but
that does not help with SET SESSION AUTHORIZATION / ROLE cases. So we
probably want (source > PGC_S_ARGV), but probably not PGC_S_OVERRIDE
because that includes irrelevant options like wal_buffers etc.

For now I've used

/* return only options that were modified (not as in config file) */
if ((conf->source <= PGC_S_ARGV) || (conf->source == PGC_S_OVERRIDE))
continue;

which generally does the right thing, although it also includes stuff
like application_name or client_encoding. But perhaps it'd be better to
whitelist the GUCs in some way, because some of the user-defined GUCs
may be sensitive and should not be included in plans.

Opinions?

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
explain-with-gucs.diff text/x-patch 6.2 KB
pEpkey.asc application/pgp-keys 8.6 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Suraj Kharage 2018-12-14 11:51:49 Catalog views failed to show partitioned table information.
Previous Message Amit Kapila 2018-12-14 11:13:38 Re: Hitting CheckRelationLockedByMe() ASSERT with force_generic_plan