Quick Links

Insert values() per-statement overhead

From:	Vladimir Sitnikov <sitnikov(dot)vladimir(at)gmail(dot)com>
To:	Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Insert values() per-statement overhead
Date:	2016-01-15 10:17:12
Message-ID:	CAB=Je-EDZwVH7FgpDOa0ad8a7a6D=1QoYmvW2kQqSAML2gf+Uw@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi,

There is a finding that insert(x) values(y);insert(x) values(z); is
2-4 times slower than insert(..) values(y),(z);
see [1], [2].

In other words, there is a significant per-statement overhead even
though server-prepared statements are properly used.
The issue is reproducible in 9.5rc1.

Is it something that was discussed previously? (I was not able to find
that in archives)
Is it something that can be improved? (e.g. "insert node state"
caching across execute calls, improve performance of "INSERT %u %u"
generation, etc)

Even though I understand there will always be _some_ per-statement
overhead, such a hight overhead plays against common case of using
ORMs.
End-users are just stuck with insert(...) values(...);

1) Java's standard way of batching statements is
"PreparedStatement#addBatch()". Unfortunately, backend protocol does
not support statement batching.
One does not simply teach Hibernate/EclipseLink,etc etc to use
PostgreSQL's-specific COPY.
Note: I am not talking about network roundtrips here. I'm just
highlighting that there is no way to execute "bind bind bind
executebatch" sequence at the protocol level.

2) One might consider "transparent rewrite of insert() batches into a
single insert() values(),(),() statement" at JDBC driver level, but it
is hard to get right as there is no easy way to parse a query. It is
really expected that every PostgreSQL connector would implement SQL
parser & insert rewriter?

3) Transparent rewrites (including "rewrite inserts to COPY") would
fail to provide "number of modified rows" for each row. Error
semantics is different as well.

4) COPY does not support UPSERT, does it?

My profiler (Instruments in Mac OS) shows that significant time is
spent in standard_ExecutorStart: see [3]
In fact, the time spent in standard_ExecutorStart even exceeds the
time spent in standard_ExecutorRun.

[1]: http://www.postgresql.org/message-id/55130DC8.2070508@redhat.com
[2]: https://github.com/pgjdbc/pgjdbc/pull/491#issuecomment-171780054
[3]: https://github.com/pgjdbc/pgjdbc/pull/491#issuecomment-171908974

Vladimir Sitnikov

Responses

Re: Insert values() per-statement overhead at 2016-01-15 10:34:57 from Andres Freund

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andres Freund	2016-01-15 10:34:57	Re: Insert values() per-statement overhead
Previous Message	Simon Riggs	2016-01-15 10:08:30	Re: Stream consistent snapshot via a logical decoding plugin as a series of INSERTs