Re: inserting large number of rows was: Re: Increasing

From: Erki Kaldjärv <erki(at)webware(dot)ee>
To: Bill Moran <wmoran(at)potentialtech(dot)com>
Cc: Qing Zhao <qzhao(at)quotefx(dot)net>, pgsql-performance(at)postgresql(dot)org
Subject: Re: inserting large number of rows was: Re: Increasing
Date: 2004-02-03 07:03:05
Message-ID: 401F47A9.7070106@webware.ee
Lists: pgsql-performance

You could do high-speed inserts with the COPY command:
http://developer.postgresql.org/docs/postgres/sql-copy.html

Check whether your database adapter/client library supports it (I guess
it does).

Note that it doesn't help very much if there are foreign keys or
triggers on the target table.
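
For illustration, a rough sketch (the table and file names are made up,
and the exact options vary a bit between PostgreSQL versions):

    -- Server-side COPY; the file must be readable by the backend:
    COPY mytable FROM '/tmp/data.txt' WITH DELIMITER '\t';

    -- From psql, \copy reads the file on the client machine instead:
    \copy mytable FROM 'data.txt' WITH DELIMITER '\t'

COPY loads the whole file in one statement, so you skip the per-INSERT
parse/plan/commit overhead.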

Bill Moran wrote:

> I must have missed this post when it was made earlier. Pardon the
> noise if my suggestion has already been made.
>
> Unlike MySQL (and possibly other database servers), PostgreSQL is
> faster when inserting inside a transaction, depending on the method
> by which you are actually adding the records.
>
> In my own experience (generating a list of INSERT statements from a perl
> script and using psql to execute them) the difference in performance was
> incredibly dramatic when I added a "BEGIN WORK" at the beginning and
> "COMMIT WORK" at the end.
>
> scott.marlowe wrote:
>
>> On Mon, 2 Feb 2004, Qing Zhao wrote:
>>
>>> I am new here. I have a question related to this in some way.
>>>
>>> Our web site needs to upload a large volume of data into Postgres at
>>> a time. The performance deteriorates as the number of rows becomes
>>> larger. When it reaches 2500 rows, it never comes back to the GUI.
>>> Since the tests were run through the GUI, my suspicion is that it
>>> might be caused by the way the application server talks to the
>>> Postgres server, the connections, etc. What might be the factors
>>> involved here? Does anyone know?
>>
>> Actually, I'm gonna go out on a limb here and assume two things:
>>
>> 1. you've got lots of fk/pk relationships set up.
>> 2. you're analyzing the table while it's empty, before loading it up.
>>
>> What happens in this instance is that an analyze on an empty (or
>> nearly empty) table means that during the inserts, PostgreSQL thinks
>> you have only a few rows. At first this is fine, as pgsql will seq
>> scan the tables to make sure there is a proper key in both. As the
>> number of rows increases, the planner needs to switch to index scans
>> but doesn't, because it doesn't know that the number of rows is
>> increasing.
>>
>> Fix: insert a few hundred rows, run analyze, and check whether the
>> explain for the inserts shows index scans or not. If not, load a few
>> hundred more rows, analyze, rinse, repeat.
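
Roughly like this (table and column names are hypothetical; the fk
check behaves much like the SELECT shown):

    ANALYZE child_table;
    ANALYZE parent_table;
    -- The fk check is roughly this lookup; once the statistics
    -- reflect the real row counts it should show an Index Scan:
    EXPLAIN SELECT 1 FROM parent_table WHERE id = 12345;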
>>
>> Also, look for fk/pk mismatches, e.g. an int4 field pointing to an
>> int8 field. That's a performance killer, so if the pk/fk types don't
>> match, see if you can change your field types to match and try again.
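
In DDL terms, the matched version looks something like this (the tables
are hypothetical):

    CREATE TABLE parent (id int8 PRIMARY KEY);
    CREATE TABLE child (
        id        int4 PRIMARY KEY,
        parent_id int8 REFERENCES parent (id)  -- same type as parent.id;
                                               -- an int4 here would defeat
                                               -- the index on the fk check
    );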
