Re: Auto-explain patch

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Dean Rasheed <dean_rasheed(at)hotmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Auto-explain patch
Date: 2008-07-07 17:03:15
Message-ID: 1215450195.4051.610.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On Sun, 2008-07-06 at 17:58 +0000, Dean Rasheed wrote:

> OK, this one should work against CVS HEAD.

OK, still getting some offsets, but it applies.

The patch now outputs things in a useful way to avoid a flood of
information, that's good.

The code seems fine, but it doesn't link up with the other logging code
in postgres.c. That split also seems to drive the way the parameters
work, so you have another set of parameters mirroring the
log_min_duration_statement infrastructure. Fairly yuk, but I understand
why you've done this. It does however beg some serious questions about
the way we currently do logging.

The patch as it stands would produce EXPLAIN output for queries that
will never be logged, which isn't very useful. So Dean, right now I feel
this version of the patch must be rejected while we discuss how things
might work differently. Not an issue with the patch, so much as issues
with the code being patched.

So thinking about the issues we can either ignore them or tackle them:

Ignoring problem option: Rewrite this patch as an executor plugin, using
Itagaki Takahiro's new infrastructure. It's an option, but I guess Dave
will have something to say about that...

Fix problems: Having logging in two separate places is ugly and leads to
the problem of logging EXPLAINs and SQL completely separately, which
isn't very useful.

If we put all the logging code in the executor we miss out any parsing
and planning time but we can log statements and explains for SPI queries
also.

We could move the logging of the EXPLAIN to postgres.c, but then we'd
need to make querydescs live a little longer than desirable, or we
trigger the EXPLAIN code using some guess at what the planning overhead
was, a horrible kluge.

Neither way is clean. So how do we do it?

My proposal would be to introduce some new logging statements that work
similarly to, yet as an alternative to log_min_duration_statement.

* log_min_planner_duration - logs any statement whose planning time
exceeds the value in ms (-1 is off).
e.g. LOG planner time: %s ms statement: %s

* log_min_executor_duration - logs any execution whose execution time
exceeds the value in ms (-1 is off). We need to pass query text through
to the executor, depending upon whether its a portal's source text or an
incoming simple querystring. (This is similar, but not quite the same as
current patch's debug_explain_min_duration).
e.g. LOG executor time: %s ms statement: %s

* log_explain = on | off - logs EXPLAIN if either
log_min_executor_duration or log_min_planner_duration triggers. We
output the explain on the same line as the SQL statement, so we don't
need to play games to link up the output later. (This is similar to, but
not quite the same as current patch's debug_explain_plan).
e.g. LOG executor time: %s ms statement: %s

If both log_min_executor_duration and log_min_planner_duration trigger
then we get only one log line like this:
e.g. LOG planner time: %s ms executor time: %s ms statement: %s

We should call these parameters log_xxx not debug_xxx because we want
them for production logging.

If either of these new parameters is > -1 then
log_min_duration_statement does not trigger. (Or we simply make
log_min_duration_statement control the setting of the other two
parameters).

The log messages shown above allow us to differentiate between the
reasons the SQL statement has been logged, as well as providing the
details we want.

If we do this then:

* we can log EXPLAINs for any execution, plus we will always have the
SQL for any EXPLAIN

* we can log statements that take a long time to plan, yet a short time
to execute. That can be an overhead in some cases. It also neatly
sidelines the issue of where logging takes place. (We need to be able to
track planner time, if we think we might want to introduce options to
control planner effort, topic discussed on other recent thread, not
here)

* we can log SQL for any execution, even those via SPI. That means we'll
be able to log the SQL used in functions.

We'll probably need an option to log SQL from SPI or not. Things like RI
trigger code probably doesn't need to be logged, especially if the top
level statement is, like ALTER TABLE.

This alters current performance logging, so I expect discussion.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David E. Wheeler 2008-07-07 17:09:44 Re: PATCH: CITEXT 2.0
Previous Message Stephen Frost 2008-07-07 16:31:58 Re: [PATCHES] column level privileges