Resurrected thread: Speed improvement - Group batch Insert - Rewrite the INSERT at the driver level (using a parameter)

From: Jeremy Whiting <jwhiting(at)redhat(dot)com>
To: "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>
Cc: magog001(at)web(dot)de
Subject: Resurrected thread: Speed improvement - Group batch Insert - Rewrite the INSERT at the driver level (using a parameter)
Date: 2015-03-25 19:34:32
Message-ID: 55130DC8.2070508@redhat.com
Lists: pgsql-jdbc

Hi,
I see this conversation [1] occurred back in 2009. I'd like to
resurrect the thread.

In response to the questions from J. W. Ulbts, I have some performance
results demonstrating the benefit, along with a response to his
suggestion to use COPY.

"Where exactly is the performance benefit that you see coming from?"

To answer this question, a simple Java JDBC project [2] was created to
demonstrate the benefit. The project contains several benchmarks, grouped
by individual statements (IndividualStatementsTest) or re-written
multi-insert (MultiInsertStatementTest). Each group has 3 different
statement/row sizes (2/5/51), named "SMALL", "MEDIUM" and "LARGE"
respectively, which are configurable. The default sizes are not intended
to be representative of any particular use case, as everyone has a
different opinion of what is appropriate.
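
To make the difference between the two benchmark groups concrete, here is
a minimal sketch of the kind of rewrite being discussed: turning a
single-row parameterized INSERT into one multi-row INSERT. The class name
BatchRewriteSketch, the method rewrite() and the table "t" are hypothetical
and exist only for illustration. This is not the pgjdbc implementation, and
the regex only handles a plain "INSERT ... VALUES (...)" with no trailing
clauses.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: hypothetical helper, not part of pgjdbc.
public final class BatchRewriteSketch {

    // Splits "INSERT ... VALUES (..)" into the head (up to and including
    // VALUES) and the single parenthesized tuple that follows it.
    private static final Pattern SINGLE_ROW_INSERT = Pattern.compile(
            "(?i)^(.*\\bVALUES\\s*)(\\(.*\\))\\s*$", Pattern.DOTALL);

    /** Repeats the VALUES tuple so one statement carries `rows` rows. */
    static String rewrite(String singleRowInsert, int rows) {
        if (rows < 1) {
            throw new IllegalArgumentException("rows must be >= 1");
        }
        Matcher m = SINGLE_ROW_INSERT.matcher(singleRowInsert);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a simple INSERT ... VALUES (...)");
        }
        StringBuilder sql = new StringBuilder(m.group(1));
        for (int i = 0; i < rows; i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append(m.group(2)); // same placeholder tuple for every row
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        // Individual statements: this text would be executed once per row.
        String single = "INSERT INTO t (a, b) VALUES (?, ?)";
        // Multi-insert: one statement covering three rows.
        System.out.println(rewrite(single, 3));
    }
}
```

With individual statements the single-row text is executed once per row,
whereas the rewritten form sends all rows in one statement, with the
parameters bound in row order.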

The project is easy to set up and run. Details are in the README file.

The attached normalized graph of ops/sec demonstrates the benefit at
different levels of concurrency. The results are for a client machine
and a dedicated server system. Details of the db system are: 32 cores
@ 2.90GHz, 283GB memory, a couple of enterprise SSDs for db storage. WAL
and tablespace are on separate devices.

"If your use case is just 'I want to do bulk inserts as fast as
possible' then perhaps the newly merged COPY support is a better way to go."

For use cases involving applications that use an ORM like Hibernate, COPY
isn't supported, nor is it likely to be. Hibernate doesn't have any
concept of handling files on the database system.

What are people's thoughts on introducing this optimization into the
pgjdbc driver?

Regards,
Jeremy

[1] http://www.postgresql.org/message-id/828427796@web.de
[2] https://github.com/whitingjr/batch-rewrite-statements-perf

--
Jeremy Whiting
Senior Software Engineer, JBoss Performance Team
Red Hat

------------------------------------------------------------
Registered Address: RED HAT UK LIMITED, 64 Baker Street, 4th Floor, Paddington. London. United Kingdom W1U 7DF
Registered in UK and Wales under Company Registration No. 3798903 Directors: Michael Cunningham (US), Charles Peters (US), Matt Parson (US) and Michael O'Neill(Ireland)

Attachment Content-Type Size
image/jpeg 20.0 KB
image/jpeg 20.2 KB
image/jpeg 20.1 KB
