Re: Cached plans and statement generalization

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Douglas Doole <dougdoole(at)gmail(dot)com>
Cc: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Cached plans and statement generalization
Date: 2017-05-11 18:15:05
Message-ID: 20170511181505.GA25894@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, May 11, 2017 at 05:39:58PM +0000, Douglas Doole wrote:
> One interesting idea from Doug Doole was to do it between the tokenizer and
> parser.  I think they are glued together so you would need a way to run the
> tokenizer separately and compare that to the tokens you stored for the
> cached plan.
>
>
> When I did this, we had the same problem that the tokenizer and parser were
> tightly coupled. Fortunately, I was able to do as you suggest and run the
> tokenizer separately to do my analysis. 
>
> So my model was to do statement generalization before entering the compiler at
> all. I would tokenize the statement to find the literals and generate a new
> statement string with placeholders. The new string would the be passed to the
> compiler which would then tokenize and parse the reworked statement.
>
> This means we incurred the cost of tokenizing twice, but the tokenizer was
> lightweight enough that it wasn't a problem. In exchange I was able to do
> statement generalization without touching the compiler - the compiler saw the
> generalized statement text as any other statement and handled it in the exact
> same way. (There was just a bit of new code around variable binding.)

Good point. I think we need to do some measurements to see if the
parser-only stage is actually significant. I have a hunch that
commercial databases have much heavier parsers than we do.

This split would also not work if the scanner feeds changes back into
the parser. I know C does that for typedefs but I don't think we do.

Ideally I would like to see percentage-of-execution numbers for typical
queries for scan, parse, parse-analysis, plan, and execute to see where
the wins are.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Euler Taveira 2017-05-11 18:25:14 Re: Logical decoding truncate
Previous Message David Fetter 2017-05-11 18:03:44 Re: [PATCH v2] Progress command to monitor progression of long running SQL queries