Re: Cached plans and statement generalization

From: Douglas Doole <dougdoole(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Cached plans and statement generalization
Date: 2017-05-11 17:39:58
Message-ID: CADE5jYJ3g-c6R45cD7KUmfpiOPRzhmf9t820d4cQzmnsnaU6eA@mail.gmail.com

>
> One interesting idea from Doug Doole was to do it between the tokenizer
> and parser. I think they are glued together so you would need a way to run
> the tokenizer separately and compare that to the tokens you stored for the
> cached plan.
>

When I did this, we had the same problem that the tokenizer and parser were
tightly coupled. Fortunately, I was able to do as you suggest and run the
tokenizer separately to do my analysis.

So my model was to do statement generalization before entering the compiler
at all. I would tokenize the statement to find the literals and generate a
new statement string with placeholders. The new string would then be passed
to the compiler which would then tokenize and parse the reworked statement.
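
To make the idea concrete, here is a rough sketch of that pre-pass in Python. It is not the code I actually wrote: the token grammar (TOKEN_RE), the function name generalize() and the $n placeholder spelling are all assumptions made up for illustration, and a real tokenizer has to handle comments, quoted identifiers and so on.

```python
import re

# Hypothetical, heavily simplified token grammar: just enough to tell
# literals apart from everything else. Not any real system's lexer.
TOKEN_RE = re.compile(r"""
    (?P<string>'(?:[^']|'')*')          # single-quoted string, '' is an escaped quote
  | (?P<number>\d+(?:\.\d+)?)           # integer or decimal literal
  | (?P<ident>[A-Za-z_][A-Za-z_0-9]*)   # keyword or identifier
  | (?P<space>\s+)                      # whitespace, copied through unchanged
  | (?P<other>.)                        # any other single character
""", re.VERBOSE | re.DOTALL)


def generalize(sql):
    """Return (generalized statement text, list of literal texts removed)."""
    out = []
    params = []
    for m in TOKEN_RE.finditer(sql):
        if m.lastgroup in ("string", "number"):
            # Literal token: remember its text and emit a placeholder instead.
            params.append(m.group())
            out.append(f"${len(params)}")
        else:
            # Everything else is copied verbatim into the new statement.
            out.append(m.group())
    return "".join(out), params


text, params = generalize("SELECT * FROM t1 WHERE a = 42 AND b = 'x''y'")
print(text)
print(params)
```

Note that the digit in "t1" is left alone because the identifier rule consumes the whole name before the number rule gets a chance.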

This means we incurred the cost of tokenizing twice, but the tokenizer was
lightweight enough that it wasn't a problem. In exchange I was able to do
statement generalization without touching the compiler - the compiler saw
the generalized statement text as any other statement and handled it in the
exact same way. (There was just a bit of new code around variable binding.)
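
Roughly, the flow around the compiler looked like the sketch below. Again this is only an illustration under assumptions: PlanCache, prepare(), compile_fn and the regex-based generalize() are hypothetical names, and compile_fn stands in for the unmodified compiler, which only ever sees the generalized text.

```python
import re

# Hypothetical literal matcher: quoted strings first, then standalone numbers.
LITERAL_RE = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def generalize(sql):
    """Replace each literal with $n and return (new text, literal texts)."""
    params = []

    def repl(m):
        params.append(m.group())
        return f"${len(params)}"

    return LITERAL_RE.sub(repl, sql), params


class PlanCache:
    """Cache keyed on generalized statement text (illustrative only)."""

    def __init__(self, compile_fn):
        self.compile_fn = compile_fn  # stand-in for the untouched compiler
        self.plans = {}
        self.compiles = 0

    def prepare(self, sql):
        text, params = generalize(sql)      # first tokenizing pass
        plan = self.plans.get(text)
        if plan is None:
            plan = self.compile_fn(text)    # compiler tokenizes and parses again
            self.plans[text] = plan
            self.compiles += 1
        # The extracted literals are what the new binding code would supply.
        return plan, params


cache = PlanCache(lambda text: ("PLAN", text))
p1, args1 = cache.prepare("SELECT * FROM t WHERE id = 1")
p2, args2 = cache.prepare("SELECT * FROM t WHERE id = 2")
print(p1 is p2, cache.compiles, args1, args2)
```

The two statements differ only in a literal, so they map to the same generalized text and the stand-in compiler is invoked once.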
