Re: INSERT times - same storage space but more fields -> much slower inserts

From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: INSERT times - same storage space but more fields -> much slower inserts
Date: 2009-04-15 11:50:51
Message-ID: alpine.DEB.2.00.0904151210400.4053@aragorn.flymine.org
Lists: pgsql-performance

On Tue, 14 Apr 2009, Stephen Frost wrote:
> What does your test harness currently look like, and what would you like
> to see to test the binary-format COPY? I'd be happy to write up the
> code necessary to implement binary-format COPY for this.

If anyone needs this code in Java, we have a version at
http://www.intermine.org/

Download source code: http://www.intermine.org/wiki/SVNCheckout

Javadoc: http://www.intermine.org/api/

The code is contained in the org.intermine.sql.writebatch package, in the
intermine/objectstore/main/src/org/intermine/sql/writebatch directory in
the source.

The public interface is org.intermine.sql.writebatch.Batch.

The Postgres-specific binary COPY code is in
org.intermine.sql.writebatch.BatchWriterPostgresCopyImpl.

The implementation unfortunately relies on a very old modified version of
the Postgres JDBC driver, which is in the intermine/objectstore/main/lib
directory.
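
For anyone who has not looked at the binary COPY format before, here is a rough sketch of the byte stream that a binary-mode COPY ... FROM STDIN expects. This is not the InterMine code. The class name BinaryCopySketch, the two-column (int4, text) row layout and the UTF-8 encoding are all assumptions made purely for illustration. Only the stream framing (signature, flags, header extension length, per-tuple field count, per-field length word, trailer) follows the format described in the Postgres COPY documentation.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: encodes (int4, text) rows as a binary COPY stream. */
class BinaryCopySketch {
    // 11-byte signature that starts every binary COPY stream.
    private static final byte[] SIGNATURE =
        {'P', 'G', 'C', 'O', 'P', 'Y', '\n', (byte) 0xFF, '\r', '\n', 0};

    static byte[] encode(int[] ids, String[] names) {
        if (ids.length != names.length) {
            throw new IllegalArgumentException("column arrays differ in length");
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // DataOutputStream writes big-endian, i.e. network byte order.
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.write(SIGNATURE);
            out.writeInt(0);                 // flags field
            out.writeInt(0);                 // header extension area length
            for (int i = 0; i < ids.length; i++) {
                out.writeShort(2);           // number of fields in this tuple
                out.writeInt(4);             // int4 payload is 4 bytes
                out.writeInt(ids[i]);
                if (names[i] == null) {
                    out.writeInt(-1);        // length -1 marks NULL, no payload
                } else {
                    // Assumption: the text column is sent as UTF-8 bytes.
                    byte[] text = names[i].getBytes(StandardCharsets.UTF_8);
                    out.writeInt(text.length);
                    out.write(text);
                }
            }
            out.writeShort(-1);              // file trailer
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] stream = encode(new int[] {1, 2}, new String[] {"ab", null});
        System.out.println("encoded bytes: " + stream.length);
    }
}
```

The resulting bytes would then have to be handed to the server through whatever COPY support the JDBC driver in use provides. That part is left out here because it depends on the driver.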

The code is released under the LGPL, and we would appreciate notification
if it is used.

The code implements quite a sophisticated system for writing rows to
database tables very quickly. It batches multiple writes together into
COPY statements and writes them in the background in another thread,
while fully honouring flush calls. The points at which it uses the
database connection are well-defined. I hope someone finds it useful.
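
To give a flavour of the batching idea without reading the whole package, here is a minimal sketch. It is not the Batch interface from the source. BackgroundBatchWriter, its methods and the Consumer sink (standing in for "send one COPY statement") are hypothetical names chosen for this example. Rows are queued by the caller, a single background thread drains them in batches, and flush() blocks until everything queued so far has been handed to the sink.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: queues rows and writes them in batches from a background thread. */
class BackgroundBatchWriter implements AutoCloseable {
    private final Object lock = new Object();
    private final List<String> pending = new ArrayList<>();
    private final Consumer<List<String>> sink; // stand-in for one COPY statement
    private final int maxBatchSize;
    private boolean writing = false;
    private boolean closed = false;
    private RuntimeException failure = null;

    BackgroundBatchWriter(Consumer<List<String>> sink, int maxBatchSize) {
        this.sink = sink;
        this.maxBatchSize = maxBatchSize;
        Thread worker = new Thread(this::run, "batch-writer");
        worker.setDaemon(true);
        worker.start();
    }

    /** Queues a row and returns without waiting for it to be written. */
    void addRow(String row) {
        synchronized (lock) {
            if (closed) throw new IllegalStateException("writer is closed");
            pending.add(row);
            lock.notifyAll();
        }
    }

    /** Blocks until every row queued before this call has reached the sink. */
    void flush() {
        synchronized (lock) {
            while ((!pending.isEmpty() || writing) && failure == null) {
                try {
                    lock.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted during flush", e);
                }
            }
            if (failure != null) throw failure;
        }
    }

    @Override
    public void close() {
        flush();
        synchronized (lock) {
            closed = true;
            lock.notifyAll();
        }
    }

    private void run() {
        while (true) {
            List<String> batch;
            synchronized (lock) {
                while (pending.isEmpty() && !closed) {
                    try { lock.wait(); } catch (InterruptedException e) { return; }
                }
                if (pending.isEmpty()) return; // closed and fully drained
                int n = Math.min(maxBatchSize, pending.size());
                batch = new ArrayList<>(pending.subList(0, n));
                pending.subList(0, n).clear();
                writing = true;
            }
            RuntimeException error = null;
            try {
                sink.accept(batch); // the only place the "connection" is touched
            } catch (RuntimeException e) {
                error = e;
            }
            synchronized (lock) {
                writing = false;
                failure = error;
                lock.notifyAll();
                if (error != null) return; // stop; flush() reports the failure
            }
        }
    }

    public static void main(String[] args) {
        List<List<String>> batches = Collections.synchronizedList(new ArrayList<>());
        try (BackgroundBatchWriter writer = new BackgroundBatchWriter(batches::add, 3)) {
            for (int i = 0; i < 7; i++) writer.addRow("row" + i);
            writer.flush();
            int rows = 0;
            for (List<String> b : batches) rows += b.size();
            System.out.println("rows written: " + rows);
        }
    }
}
```

Because the sink is only ever called from the worker thread, it is easy to reason about when the connection is in use. How many batches a given run produces depends on thread timing, so only the total row count and the row order are predictable.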

Matthew

--
-. .-. .-. .-. .-. .-. .-. .-. .-. .-. .-. .-. .-.
||X|||\ /|||X|||\ /|||X|||\ /|||X|||\ /|||X|||\ /|||X|||\ /|||
|/ \|||X|||/ \|||X|||/ \|||X|||/ \|||X|||/ \|||X|||/ \|||X|||/
' `-' `-' `-' `-' `-' `-' `-' `-' `-' `-' `-' `-'
