Re: INSERTing lots of data

From: Szymon Guz <mabewlun(at)gmail(dot)com>
To: Joachim Worringen <joachim(dot)worringen(at)iathh(dot)de>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: INSERTing lots of data
Date: 2010-05-28 09:48:16
Message-ID: AANLkTimvkkS7CYZtuIbIuTTiaHDJMLWkpPfji41Jh9-b@mail.gmail.com
Lists: pgsql-general

2010/5/28 Joachim Worringen <joachim(dot)worringen(at)iathh(dot)de>

> Greetings,
>
> my Python application (http://perfbase.tigris.org) repeatedly needs to
> insert lots of data into an existing, non-empty, potentially large table.
> Currently, the bottleneck is with the Python application, so I intend to
> multi-thread it. Each thread should work on a part of the input file.
>
> I already multi-threaded the query part of the application, which requires
> using one connection per thread - cursors are serialized via a single
> connection.
>
> Provided that
> - the threads use their own connection
> - the threads perform all INSERTs within a single transaction
> - the machine has enough resources
>
> will I get a speedup? Or will table-locking serialize things on the server
> side?
>
> Suggestions for alternatives are welcome, but the data must go through the
> Python application via INSERTs (no bulk insert, COPY etc. possible)
>
>
Remember that some Python implementations have a GIL (Global Interpreter
Lock), so those threads could end up serialized at the Python level.

It is possible that those inserts will be faster. The speed depends on the
table structure, constraints and triggers, and even the database
configuration. The best answer is: just check it with some test code. Write a
simple multithreaded application, try to do the inserts, and see how it
performs.
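
A minimal sketch of such a test could look like the one below. It assumes
psycopg2 and a hypothetical table samples(run_id integer, value double
precision); the DSN, table name, and row counts are placeholders to adapt to
your setup. Comparing the elapsed time against a single-threaded run should
show whether the GIL or server-side contention is the limiting factor.

import threading
import time
import psycopg2

DSN = "dbname=perfbase user=postgres"   # assumption: adapt to your environment
THREADS = 4
ROWS_PER_THREAD = 10000

def worker(thread_id):
    # One connection per thread; cursors on a shared connection would serialize.
    conn = psycopg2.connect(DSN)
    try:
        cur = conn.cursor()
        for i in range(ROWS_PER_THREAD):
            cur.execute(
                "INSERT INTO samples (run_id, value) VALUES (%s, %s)",
                (thread_id, float(i)),
            )
        # All INSERTs of this thread go into a single transaction.
        conn.commit()
    finally:
        conn.close()

start = time.time()
threads = [threading.Thread(target=worker, args=(t,)) for t in range(THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("inserted %d rows in %.2f s" % (THREADS * ROWS_PER_THREAD, time.time() - start))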

regards
Szymon Guz
