Skip site navigation (1) Skip section navigation (2)

Cached Query Plans (was: global prepared statements)

From: PFC <lists(at)peufeu(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Cached Query Plans (was: global prepared statements)
Date: 2008-04-11 16:34:41
Message-ID: op.t9ggb3wacigqcu@apollo13.peufeu.com (view raw or flat)
Thread:
Lists: pgsql-hackers
	Well, I realized the idea of global prepared statements actually sucked,  
so I set on another approach thanks to ideas from this list, this is  
caching query plans.

First, let's see if there is low hanging fruit with the typical small,  
often-executed queries that are so frequent on websites.
Tables test, test2 and test3 contain id (integer primary key) and another  
integer field. There are 100K rows in each.

First, the simplest query :
SELECT * FROM test WHERE id = $1

110 us : Send query as text (PHP:pg_query -> PQexec)
125 us : Parse+Bind (PHP:pg_query_params -> PQexecParams)
  67 us : Execute a previously prepared statement (PHP:pg_execute ->  
PQexecPrepared)

A slightly more complex one but still pretty classic :
SELECT * FROM (SELECT * FROM test WHERE id>$1 ORDER BY id LIMIT 5) AS a  
NATURAL LEFT JOIN test2 NATURAL LEFT JOIN test3 ORDER BY id

523 us : Send query as text (PHP:pg_query -> PQexec)
580 us : Parse+Bind (PHP:pg_query_params -> PQexecParams)
148 us : Execute a previously prepared statement (PHP:pg_execute ->  
PQexecPrepared)

OK, so there is low hanging fruit since the parsing+planning time of those  
is longer than doing the query itself.

Since the Parse message includes a $-parameterized query that is to be  
prepared, it seems logical to put the caching logic there : the query  
string (without parameters) makes a nice cache key.

So I made a really quick and really dirty experimentation without changing  
the wire protocol between client and server. This is only "proof of  
concept".

Try #1 : in exec_parse_message(), if the statement is named, look it up in  
the prepared statements cache, if it is found, return at once and do  
nothing else.
To exploit this, I issue a pg_prepare() followed by pg_execute() at every  
query, wether or not the statement exists. If it already exists,  
pg_prepare() now does nothing (except losing a little time).

Results :
  88 us : simple query
173 us : complex query

So, the timings are between a simple execute and a plan+execute. It  
provides a nice performance gain versus replanning every time, but not  
perfect.

Try #2 : again, in exec_parse_message(), if the statement is unnamed, I  
use the query string as the statement name, search the plan in the  
prepared statements hash table. If it is not found, then it is prepared.  
Then I make the unnamed statement plan point to this. Of course, this is  
dangerous since it probably introduces a dozen crash bugs, but for this  
proof of concept, it's OK.
Client code is unchanged, PQexecParams will benefit from the plan caching,  
since it always sends a Parse+Bind message using the unnamed statement.

Results are identical to executing an execute on a prepared statement,  
modulo a few microseconds.
This means the overhead of sending the Parse message, and of the server  
ignoring it when the statement is cached, is negligible.

	So, where to go from that ? I don't see a way to implement this without a  
(backwards-compatible) change to the wire protocol, because the clients  
will want to specify when a plan should be cached or not. Since the user  
should not have to name each and every one of the statements they want to  
use plan caching, I see the following choices :

	- Add a new Parse+Bind command, which gets the $-parameterized SQL and  
the parameters. If the plan is cached, grab it and execute, else prepare  
and execute. Add a flag to allow the client to specify if he wants caching  
or not.
	Pros : Only one message, faster
	Cons : SQL is transmitted in full, useless most of the time, but this  
overhead is rather small.

	- Send the SQL with Bind as statement name, add a flag to Bind telling it  
to report a cache miss instead of raising an error, then have the client  
send a Parse and Bind again.

	- Should there be one common hashtable for named prepared statements and  
cached plans, or two hashtables ? Using the SQL string as the statement  
name is not clean.


























Responses

pgsql-hackers by date

Next:From: Tom DunstanDate: 2008-04-11 16:35:26
Subject: Re: Commit fest queue
Previous:From: Tom LaneDate: 2008-04-11 16:23:06
Subject: Re: Index AM change proposals, redux

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group