[FEATURE PATCH] pg_stat_statements with plans

From: Julian Markwort <julian(dot)markwort(at)uni-muenster(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: marius(dot)timmer(at)uni-muenster(dot)de, arne(dot)scheffer(at)uni-muenster(dot)de
Subject: [FEATURE PATCH] pg_stat_statements with plans
Date: 2017-01-25 17:34:49
Message-ID: 9e43fd8f-4d35-4b9d-545c-f9011cd4aa5d@uni-muenster.de
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hello psql-hackers!

TL:DR;
We extended the functionality of pg_stat_statements so it can track
worst and best case execution plans.

Based on a suggestion of my colleague Arne Scheffer, Marius Timmer and I
extended pg_stat_statements so it can also record execution plans,
whenever the execution time is exceeded (or deceeded) by a definable
factor.
We were largely inspired by the pg_stat_plans extension by Peter
Geoghegan and Simon Riggs - we don't claim any originality on this part
- which is unfortunately not available on newer postgresql versions.
There are a few differences which will become apparent in the following
lines.

By default, the modified pg_stat_statements extension will now track
good plans and bad plans for each entry in pg_stat_statements.
The plans are not normalized or hashed (as opposed to pg_stat_plans),
they represent discreet statements.
A good plan is saved, whenever this sort of query has been used for the
first time or the time of the previously recorded good plan has been
deceeded by a smaller factor than 0.9 .
Analogous to this, a bad_plan is saved, when the time has been exceeded
by a factor greater than 1.1 .
There are GUCs available so these parameters can be tuned to your
liking. Tracking can be disabled for both plans individually.
A plan_format can be defined to enable better readability or
processability through other tools.

You can reset your good and bad plans by using a
select on pg_stat_statements_good_plan_reset([queryid]);
resetting bad plans uses pg_stat_statements_bad_plan_reset, obviously.
In case of a reset, the execution time, timestamp and plan itself are
just set to 0 respective NULL.

The pg_stat_statements view now provides six extra columns:
good_plan, good_plan_time, good_plan_timestamp, bad_plan, bad_plan_time
and bad_plan_timestamp.

Plans are only displayed if the showtext argument is true and the user
is the superuser or the user who has been associated with that entry.

Furthermore, we implemented a GUC that allows you to control the maximum
refresh frequency to avoid performance impacts on restarts or resets.
A plan is only updated when tracking is enabled and more time than
"plan_min_interval" has passed (default: 5 seconds) and the previously
mentioned conditions for the execution time have been met.

The major selling point of this feature?
Beeing able to find plans that need optimization (e.g. by creating
indexes). As pg_stat_statements tracks normalized queries, there might
be certain values or even daytimes that result in very bad plans, while
others result in perfectly fine plans.
Of course, the GUC log_min_duration_statement can also detect long
runners, but the advantage of pg_stat_statements is that we count the
total calls of normalized queries, which enables us to find plans, that
don't count as long runners, while their aggregated time might show
shortcomings regarding their plans.

We've found this sort of tool really useful when dealing with queries
produced by ORM libraries, where optimization is not intuitive.

Various tests using pg_bench suggest that this extension does not worsen
the performance of the database.

We're really looking forward to your opinions and feedback on this
feature patch
Julian, Marius and Arne

Attachment Content-Type Size
pgss_plans_v01.patch text/x-patch 36.1 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2017-01-25 17:43:53 Re: [FEATURE PATCH] pg_stat_statements with plans
Previous Message Peter Eisentraut 2017-01-25 17:33:33 Re: Logical Replication WIP