Transient plans versus the SPI API

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Transient plans versus the SPI API
Date: 2011-08-02 20:47:18
Message-ID: 29216.1312318038@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I've been thinking about how to redesign the plancache infrastructure to
better support use of transient (one-shot) plans, as we've talked about
various times such as in this thread:
http://archives.postgresql.org/pgsql-hackers/2010-02/msg00607.php
(Note: that thread sorta went off into the weeds arguing about exactly
what heuristics to use for when to re-plan. I don't want to re-open that
issue today, since there's no way to experiment with policy until we have
some mechanism in place.)

I think that what we need to do is get rid of the assumption that a "cached
plan" normally includes a plan per se. The initial creation of the cache
entry should just supply a raw query plus its analyzed-and-rewritten form.
(plancache.c can actually operate that way today, via its "not fully
planned" flag, but it's a wart rather than the normal philosophy.) Then
RevalidateCachedPlan should be replaced by something with the semantics
of "get me a plan to use, and here's the parameter values I'm going to use
it with". The choice between using a pre-cached generic plan and building
a one-off plan would then be localized in this new function.

There are not that many places that call plancache.c directly, and so this
change in API won't cause much code churn --- but one place that does
depend on this is spi.c, and there is *lots* of both core and third-party
code that calls SPI_prepare for example. So we need to tread carefully in
redefining SPI's behavior.

The most straightforward way to reimplement things within spi.c would be
to redefine SPI_prepare as just doing the parse-and-rewrite steps, with
planning always postponed to SPI_execute. In the case where you just
prepare and then execute a SPIPlan, this would come out the same or
better, since we'd still just do one planning cycle, but the planner could
be given the actual parameter values to use. However, if you SPI_prepare,
SPI_saveplan, and then SPI_execute many times, you might come out behind.
This is of course the same tradeoff we are going to impose at the SQL level
anyway, but I wonder whether there needs to be a control knob available to
C code to retain the old plan-once-and-always-use-that-plan approach.

Anyone have an opinion about that? And if we do need to expose some
control, should the default (if you don't change your source code) be that
you still get the old behavior, or that you get the new behavior? I'm
inclined to think that if we believe this'll be a win at the SQL level,
it should be a win at the SPI-caller level too, but maybe someone thinks
otherwise.

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2011-08-02 21:03:21 Re: Transient plans versus the SPI API
Previous Message Kevin Grittner 2011-08-02 18:46:44 Re: WIP fix proposal for bug #6123