Re: Cached plans and statement generalization

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Cached plans and statement generalization
Date: 2017-05-11 15:12:33
Message-ID: 20170511151233.GE17200@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, May 10, 2017 at 07:11:07PM +0300, Konstantin Knizhnik wrote:
> I am going to continue work on this patch I will be glad to receive any
> feedback and suggestions for its improvement.
> In most cases, applications are not accessing Postgres directly, but using
> some connection pooling layer and so them are not able to use prepared
> statements.
> But at simple OLTP Postgres spent more time on building query plan than on
> execution itself. And it is possible to speedup Postgres about two times at
> such workload!
> Another alternative is true shared plan cache. May be it is even more
> perspective approach, but definitely much more invasive and harder to
> implement.

Can we back up and get an overview of what you are doing and how you are
doing it? Our TODO list suggests this order for successful patches:

Desirability -> Design -> Implement -> Test -> Review -> Commit

You kind of started at the Implementation/patch level, which makes it
hard to evaluate.

I think everyone agrees on the Desirability of the feature, but the
Design is the tricky part. I think the design questions are:

* What information is stored about cached plans?
* How are the cached plans invalidated?
* How is a query matched against a cached plan?

Looking at the options, ideally the plan would be cached at the same
query stage as the stage where the incoming query is checked against the
cache. However, caching and checking at the same level offers no
benefit, so they are going to be different. For example, caching a
parse tree at the time it is created, then checking at the same point if
the incoming query is the same doesn't help you because you already had
to create the parse tree get to that point.

A more concrete example is prepared statements. They are stored at the
end of planning and matched in the parser. However, you can easily do
that since the incoming query specifies the name of the prepared query,
so there is no trick to matching.

The desire is to cache as late as possible so you cache more work and
you have more detail about the referenced objects, which helps with
cache invalidation. However, you also want to do cache matching as
early as possible to improve performance.

So, let's look at some options. One interesting idea from Doug Doole
was to do it between the tokenizer and parser. I think they are glued
together so you would need a way to run the tokenizer separately and
compare that to the tokens you stored for the cached plan. The larger
issue is that prepared plans already are checked after parsing, and we
know they are a win, so matching any earlier than that just seems like
overkill and likely to lead to lots of problems.

So, you could do it after parsing but before parse-analysis, which is
kind of what prepared queries do. One tricky problem is that we don't
bind the query string tokens to database objects until after parse
analysis.

Doing matching before parse-analysis is going to be tricky, which is why
there are so many comments about the approach. Changing search_path can
certainly affect it, but creating objects in earlier-mentioned schemas
can also change how an object reference in a query is resolved. Even
obscure things like the creation of a new operator that has higher
precedence in the query could change the plan, though am not sure if
our prepared query system even handles that properly.

Anyway, that is my feedback. I would like to get an overview of what
you are trying to do and the costs/benefits of each option so we can
best guide you.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Verite 2017-05-11 15:16:06 Re: export import bytea from psql
Previous Message Simon Riggs 2017-05-11 15:08:11 Re: Time based lag tracking for logical replication