Re: Slowness of extended protocol

From: Vladimir Sitnikov <sitnikov(dot)vladimir(at)gmail(dot)com>
To: Shay Rojansky <roji(at)roji(dot)org>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Greg Stark <stark(at)mit(dot)edu>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Slowness of extended protocol
Date: 2016-08-09 08:50:03
Message-ID: CAB=Je-FHSwrbJiTcTDeT4J3y_+WvN1d+S+26aesr85swocb7EA@mail.gmail.com
Lists: pgsql-hackers

Shay>There are many scenarios where connections are very short-lived (think
about webapps where a pooled connection is allocated per-request and reset
in between)

Why is the connection reset in between in the first place?
In pgjdbc we do not reset the per-connection statement cache, so named
statements are easily reused across pooled connections.
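
To illustrate the idea, here is a minimal sketch of a per-connection
statement cache that survives being returned to a pool. It is a toy model,
not pgjdbc's actual code: the class names, the "S_n" naming scheme and the
capacity of 256 are all assumptions made for the example.

```python
from collections import OrderedDict


class StatementCache:
    """Per-connection map from SQL text to a server-side statement name."""

    def __init__(self, capacity=256):
        self.capacity = capacity          # hypothetical limit, not a pgjdbc default
        self._names = OrderedDict()       # SQL text -> statement name, in LRU order
        self._next_id = 0
        self.parse_count = 0              # how many times a Parse would be sent

    def lookup(self, sql):
        """Return (statement_name, needs_parse) for the given SQL text."""
        if sql in self._names:
            self._names.move_to_end(sql)  # mark as most recently used
            return self._names[sql], False
        self._next_id += 1
        name = f"S_{self._next_id}"
        self._names[sql] = name
        self.parse_count += 1
        if len(self._names) > self.capacity:
            # Evict the least recently used entry; a real driver would also
            # have to close the evicted statement on the server.
            self._names.popitem(last=False)
        return name, True


class PooledConnection:
    """Stand-in for a physical connection that a pool hands out repeatedly."""

    def __init__(self):
        self.statements = StatementCache()

    def execute(self, sql):
        return self.statements.lookup(sql)

    def release(self):
        # Returning to the pool deliberately leaves self.statements intact.
        pass


conn = PooledConnection()
results = []
for _request in range(3):                 # three web requests, same pooled connection
    results.append(conn.execute("SELECT * FROM users WHERE id = $1"))
    conn.release()
print(results)
print(conn.statements.parse_count)
```

In this model only the first request pays for a Parse; the later requests
find the named statement already in the cache because release() does not
clear it.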

Shay>and the extra roundtrip that preparing entails is too much.

When a server-prepared statement gets reused, neither Parse nor Describe
is sent.

Shay>There are also many scenarios where you're not necessarily going to
send the same query multiple times in a single connection lifespan, so
preparing is again out of the question.

Can you list at least one scenario of that kind, so we can code it into
pgbench (or the like) and validate "simple vs prepared" performance?

Shay>And more generally, there's no reason for a basic, non-prepared
execution to be slower than it can be.

That's too generic. If the performance for "end-to-end cases" is just fine,
then it is not worth optimizing further. Typical application best practice
is to reuse SQL text (for both security and performance reasons), so in the
typical applications I've seen, query text was reused and was therefore
naturally handled by the server-prepared logic.

Let me highlight another direction: the current execution of a
server-prepared statement requires some copying of the "parse tree" (or
whatever). I bet it would be much better to invest in removing that copying
than in a "make one-time queries faster" thing. If we could make "Exec"
processing faster, it would immediately improve tons of applications.

Shay>Of course we can choose a different query to benchmark instead of
SELECT 1 - feel free to propose one (or several).

I've tried pgbench -M prepared, and it is way faster than pgbench -M simple.

Once again: all cases I have in mind would benefit from reusing
server-prepared statements. In other words, after some warmup the
application would use just Bind-Execute-Sync kind of messages, and it would
completely avoid Parse/Describe ones.
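
The warmup behaviour can be sketched as a small model of which frontend
messages a driver would send per execution. This is a simplification under
stated assumptions: the threshold of 5 executions is an illustrative value,
ProtocolModel is a made-up name, and the exact message order a real driver
uses may differ.

```python
from collections import Counter

PREPARE_THRESHOLD = 5  # illustrative assumption, not taken from the thread


class ProtocolModel:
    """Toy model of the messages sent for each execution of a query."""

    def __init__(self, threshold=PREPARE_THRESHOLD):
        self.threshold = threshold
        self.uses = Counter()   # SQL text -> number of executions so far
        self.prepared = set()   # SQL texts that already have a named statement

    def messages_for(self, sql):
        self.uses[sql] += 1
        if sql in self.prepared:
            # Steady state: the named statement exists, so no Parse/Describe.
            return ["Bind", "Execute", "Sync"]
        if self.uses[sql] >= self.threshold:
            # The query is "hot": create a named statement on this execution.
            self.prepared.add(sql)
            return ["Parse(named)", "Describe", "Bind", "Execute", "Sync"]
        # Still warming up: use the unnamed statement every time.
        return ["Parse(unnamed)", "Describe", "Bind", "Execute", "Sync"]


model = ProtocolModel()
trace = [model.messages_for("SELECT abalance FROM pgbench_accounts WHERE aid = $1")
         for _ in range(7)]
for step, messages in enumerate(trace, start=1):
    print(step, messages)
```

With these assumptions, executions 1 to 4 use the unnamed statement,
execution 5 creates the named one, and everything after that is just
Bind-Execute-Sync.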

If a statement is indeed a "one-time" statement, then I do not care much
how long it takes to execute.

Shay>FYI in Npgsql specifically describe isn't used to get any knowledge
about parameters - users must populate the correct parameters or query
execution fails.

I think the main reason pgjdbc uses Describe is to get the result OIDs.
pgjdbc is not "full binary", so it has to be careful about which fields it
requests in binary format.
That indeed slows down "unknown queries", but once a query gets reused,
pgjdbc switches to server-prepared execution and eliminates the
Parse/Describe overhead completely.
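
As a rough sketch of why the result OIDs matter: a driver that can decode
only some types in binary has to pick a format code per result column when
it sends Bind (0 for text, 1 for binary). The OID values below are the
standard ones for those built-in types, but the set of "binary capable"
types is a hypothetical choice for the example, not pgjdbc's actual list.

```python
# Type OIDs of a few built-in PostgreSQL types.
INT8_OID, INT4_OID, TEXT_OID, FLOAT8_OID, NUMERIC_OID = 20, 23, 25, 701, 1700

# Hypothetical: the types this imaginary driver can decode from binary.
BINARY_CAPABLE_OIDS = frozenset({INT4_OID, INT8_OID, FLOAT8_OID})

TEXT_FORMAT, BINARY_FORMAT = 0, 1  # format codes used in the Bind message


def result_format_codes(column_oids):
    """Pick a result format code for each column from its described type OID."""
    return [BINARY_FORMAT if oid in BINARY_CAPABLE_OIDS else TEXT_FORMAT
            for oid in column_oids]


# Columns described as (int4, numeric, text): only the int4 goes binary.
codes = result_format_codes([INT4_OID, NUMERIC_OID, TEXT_OID])
print(codes)
```

Without a prior Describe the driver does not know the column types, so in
this model it could not safely ask for binary results on an unknown query.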

Vladimir
