Re: Prepared statements and generic plans

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Prepared statements and generic plans
Date: 2016-06-03 03:44:24
Message-ID: CAKFQuwZCzOzWkXQOH5wE1=GuC66x=d54BDRQM0bdmVbLSz64EA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jun 2, 2016 at 9:56 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:

> In Postgres 9.2 we improved the logic of when generic plans are used by
> EXECUTE. We weren't sure how well it would work, and the docs included
> a vague description on when generic plans are chosen.
>
> I have gotten a few questions lately about how prepared statements are
> handled with multiple executions so I have updated the PREPARE manual
> page with the attached patch to more clearly explain generic plans and
> when they are chosen.
>
> I would like to apply this to the 9.6 docs.
>

​While putting the proposed patch in context I came across this.

​"""

Prepared statements have the largest performance advantage when a single
session is being used to execute a large number of similar statements. The
performance difference will be particularly significant if the statements
are complex to plan or rewrite, for example, if the query involves a join
of many tables or requires the application of several rules. If the
statement is relatively simple to plan and rewrite but relatively expensive
to execute, the performance advantage of prepared statements will be less
noticeable.
​"""

Until and unless the generic plan is used the "...if statements are
complex to plan..." doesn't make sense; and no where in the description
itself do we introduce the generic plan concept. This is inconsistent but
I'm not addressing it below though its worth considering before a patch to
this area is committed.

As to the patch...

Isn't this backwards? <note> [change]

"""
otherwise it <switching to a generic plan> occurs only after five or more
executions produce [execution /strike plans] plus planning costs that are,
on average, [roughly equal to /strike cheaper] than the generic plan.
"""

I'd maybe go with something like this:

All executions of a prepared statement having zero parameters will use the
same plan so the planning time taken during the first execution will be
spread across all subsequent executions. For statements having parameters
the first five executions will result in value-specific plans as previously
described. However, on the sixth execution a generic plan will also be
computed and if the average planning + execution cost of all previous
value-specific plans is about equal to the execution cost of the generic
plan the generic plan will be chosen for that and all subsequent executions.

<existing next paragraph>

If we are getting generic plans significantly cheaper than the
value-specific plans I suspect there is a problem...so any comparison that
indicates "less-than" is prone to cause confusion. The original is worded
well on this point: "...generic plan appears to be not much more
expensive..." but lacks detail elsewhere.

This part:

! A generic plan assumes each value supplied to
<command>EXECUTE</command>
! is one of the column's distinct values and that column values are
! uniformly distributed. For example, if statistics records three
! distinct column values, a generic plan assumes a column equality
! comparison will match 33% of processed rows. Column statistics
! also allows generic plans to accurately compute the selectivity of
! unique columns. Comparisons on non-uniformly-distributed columns and
! specification of non-existent values affects the average plan cost,
! and hence if and when a generic plan is chosen. [elided the last
sentence, placed in the first paragraph]

I'm not sure of the specific goal here but this level detail seems a bit
out-of-place in the SQL Command documentation. So, do we want this
user-facing and if so do we want it here?

So while looking to find a home for this detail I came across the fact that
Chapter 66 doesn't have a table of contents....quite possibly because it is
presently a single section chapter...

https://www.postgresql.org/docs/devel/static/planner-stats-details.html

It seems like this content would find a nice home there as well (though we
should fix any TOC bug before adding the new section...)

IMO there are now three possibilities for the existing paragraph:

Status-quo
First portion of Bruce's patch
Mine, above

I think the second part of Bruce's should be, possibly with additional
detail and maybe some concrete examples, added to Chapter 66 instead. A
pointer to that section in the SQL Command docs could be made - though I'm
not sure how much pointing into internals we do from the user-facing areas
of the docs. In any case the material would be available to those who wish
to see it.

The pertinent points, 5 value-specific executions and comparing the average
of all previous executions to the generic plan generated on the 6th
execution, both do seem worth upsetting the status-quo for. Beyond that it
is style - except I think Bruce got the comparison direction and magnitude
wrong as noted above.

This leaves Bruce's second alteration: which probably should follow the
rest over to chapter 66. The point of the existing sentence is to give the
casual user the means to detect the current type of plan and I think that
is all that is needed here.

David J.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-06-03 04:03:00 Re: [BUGS] BUG #14155: bloom index error with unlogged table
Previous Message Bruce Momjian 2016-06-03 02:05:22 Re: Prepared statements and generic plans