Re: first time hacker ;) messing with prepared statements

From: Joris Dobbelsteen <joris(at)familiedobbelsteen(dot)nl>
To: PFC <lists(at)peufeu(dot)com>
Cc: James Mansion <james(at)mansionfamily(dot)plus(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: first time hacker ;) messing with prepared statements
Date: 2008-03-30 19:23:58
Message-ID: 47EFE8CE.7030407@familiedobbelsteen.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

PFC wrote:
>> PFC wrote:
>>> Hello,
>>> So, I embarked (yesterday) on a weekend project to add a new
>>> feature to Postgres...
>>> I use PHP with persistent connections and always have been
>>> bothered that those very small AJAX queries (usually simple selects
>>> returning 1 row) take more CPU in postgres to parse & plan than to
>>> actually execute.
>> Microsoft's answer to this issue with SQLServer appears to have been
>> to introduce a smart
>> cache for all statement plans. It seems to be very effective. I guess
>> you're doing much the
>> same thing but with more user intervention, in effect.
> Actually, the main purpose was to 1) have fun hacking Postgres, and 2)
> perhaps something useful would come of it...
Nice job so far!
Since you have achieved (1), now its time to get to (2) and I thinks its
very well possible. In general its quite nice to have parts of your work
integrated.
> And I did find it very interesting, probably due to the fact that
> Postgres source code is so... impressively clean... well organized...
> readable... it took two hours from downloading the source to having an
> extra functional system catalog, and it worked at the first compile,
> all due to the concise but to the point comments in include/catalog, I
> couldn't believe it.
> Anyway, Microsoft's solution is cool, too, but you need to reparse
> the entire query to then detect "I've planned this query before, with
> other parameters, so I'll reuse that prepared plan", so it adds
> another parsing step, which is less efficient.
The MAJOR benefit of Microsoft's approach is that it works on existing
application, and, most importantly makes NO assumptions on the
"volatile" server state. A few cases where the Microsoft solution works,
while yours will fail is:

* Server restart and assorted like failover (you need to redo a
global prepare).
* Cleanup and instantiation of a prepared statement.

> Postgres could also do that with a small modification, by the way
> : like by using the entire string (with $1 style parameters) instead
> of the statement name, use that as a cache key, and send parameters
> separately, but I think it would be less clean than, say, a statement
> called "get_user_by_id" or something.
> Also I like the idea of named prepared queries, which feel a bit
> like procedures, because many of those small, often-used queries would
> end up being defined in the same place, which makes schema changes
> (and having to modify queries in your application) slightly less painful.
> PHP can also use pg_exec() which sends the parameters separately,
> automagically converted to postgres format, so you save time and
> hassle on quoting and de-quoting versus a SQL EXECUTE.
> Since the performance of pg_exec() is almost twice as fast as
> plain SQL, and PHP scripts tend to use quite a lot of these queries,
> it also means a free database server performance upgrade (ie. the same
> DB server can handle more webservers, for instance).
What you are doing for a global query cache is already in consideration
and having plan invalidation mechanism on schema changes or, maybe,
statistic updates was a step into that direction. You code mostly
contributed the other parts already.

I thinks a good middle ground to address the "persistence" problems I
think there are and keep close to your patch might be using "stored
procedures" and have these cached globally.

Another considerations is whether most task are getting CPU bound or IO
bound. A better, per query, plan might reduce IO load due to better use
of statistics on that single case, while for CPU bound it is very nice
to reduce the planning overhead significantly.
> Another possible implementation would be to use a connection pooler
> which, when opening a new connection, can be configured to send a SQL
> script containing all the PREPARE statements.
This is, IMHO, an application side feature that might be a good addition
to PHP and other languages that provide the "persistent connection" feature.

- Joris

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2008-03-30 19:37:51 Re: jaguar is failing
Previous Message Tom Lane 2008-03-30 19:09:20 Re: [PATCHES] psql slash# command