Re: Transient plans versus the SPI API

From: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
To: Hannu Krosing <hannu(at)krosing(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transient plans versus the SPI API
Date: 2011-08-08 08:39:55
Message-ID: 4E3FA0DB.1010107@thl.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 08/07/2011 12:25 PM, Hannu Krosing wrote:
> On Sun, 2011-08-07 at 11:15 +0200, Hannu Krosing wrote:
>> On Wed, 2011-08-03 at 15:19 -0400, Tom Lane wrote:
>>> Hm, you mean reverse-engineering the parameterization of the query?
>> Yes, basically re-generate the query after (or while) parsing, replacing
>> constants and arguments with another set of generated arguments and
>> printing the list of these arguments at the end. It may be easiest to do
>> This in parallel with parsing.
>>
>>> Interesting thought, but I really don't see a way to make it practical.
>> Another place where this could be really useful is logging& monitoring
>>
>> If there were an option to log the above queries as
>>
>> "SELECT a, b FROM foo WHERE c = $1", (123)
>> "SELECT a, b FROM foo WHERE c = $1", (97)
>> "SELECT a, b FROM foo WHERE c = $1", (236)
> The main monitoring use_case would be pg_stat_statements,
> http://developer.postgresql.org/pgdocs/postgres/pgstatstatements.html
> which is currently pretty useless for queries without parameters

I was trying to implement something similar for pgpool-II. The user
could configure queries for which cached plans are wanted. The
configuration would have been a file containing lines in format "SELECT
* FROM foo WHERE id = ?". I did not get anything implemented, as there
were some problems. The problems were mainly with DEALLOCATE ALL called
without pgpool-II knowing it, issues with search_path and the amount of
work needed to implement parse tree matching.

It would be interesting if pg_stat_statements would be globally
available with queries using generic arguments. First, there would be an
obvious heuristic for when to cache the plan: If the average runtime of
the query is much larger than the average planning time, there is no
point in caching the plan. This would also give one option for cache hit
estimation. The hit_percent is directly available. On the other hand
pg_stat_statements could easily become a choke-point.

I would love to work on this, but I lack the needed skills. Maybe I
could take another try for writing a proof-of-concept parse tree
transformer and matcher, but I doubt I can produce anything useful.

- Anssi

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tim Bunce 2011-08-08 09:03:55 Re: plperl crash with Debian 6 (64 bit), pl/perlu, libwww and https
Previous Message Shigeru Hanada 2011-08-08 07:23:18 Re: per-column FDW options, v5