Skip site navigation (1) Skip section navigation (2)

Re: inserting large number of rows was: Re: Increasing

From: Erki Kaldjärv <erki(at)webware(dot)ee>
To: Bill Moran <wmoran(at)potentialtech(dot)com>
Cc: Qing Zhao <qzhao(at)quotefx(dot)net>, pgsql-performance(at)postgresql(dot)org
Subject: Re: inserting large number of rows was: Re: Increasing
Date: 2004-02-03 07:03:05
Message-ID: 401F47A9.7070106@webware.ee (view raw or flat)
Thread:
Lists: pgsql-performance
You could do high speed inserts with COPY command:
http://developer.postgresql.org/docs/postgres/sql-copy.html

Check whenether your database adapter/client lib supports it (i guess it 
does).

Note that it doesnt help very much if there are fk's/triggers's on the 
target table.

Bill Moran wrote:

> I must have missed this post when it was made earlier.  Pardon the 
> noise if
> my suggestion has already been made.
>
> Unlike MySQL (and possibly other database servers) PostgreSQL is 
> faster when
> inserting inside a transaction.  Depending on the method in which you are
> actually adding the records.
>
> In my own experience (generating a list of INSERT statements from a perl
> script and using psql to execute them) the difference in performance was
> incredibly dramatic when I added a "BEGIN WORK" at the beginning and
> "COMMIT WORK" at the end.
>
> scott.marlowe wrote:
>
>> On Mon, 2 Feb 2004, Qing Zhao wrote:
>>
>>
>>> I am new here. I have a question related to this in some way.
>>>
>>> Our web site needs to upload a large volume of data into Postgres at 
>>> a time. The performance deterioates as number of rows becomes 
>>> larger.  When it reaches 2500 rows, it never come back to GUI. Since 
>>> the tests were run through GUI, my suspision is
>>> that it might be caused by the way the application server talking to 
>>> Postgres server, the connections, etc.. What might be the factors 
>>> involved here? Does anyone know?
>>
>>
>>
>> Actually, I'm gonna go out on a limb here and assume two things:
>>
>> 1. you've got lotsa fk/pk relationships setup.
>> 2. you're analyzing the table empty before loading it up.
>>
>> What happens in this instance is that the analyze on an empty, or 
>> nearly so, table, means that during the inserts, postgresql thinks 
>> you have only a few rows.  At first, this is fine, as pgsql will seq 
>> scan the  tables to make sure there is a proper key in both.  As the 
>> number of rows increases, the planner needs to switch to index scans 
>> but doesn't, because it doesn't know that the number of rows is 
>> increasing.
>>
>> Fix:  insert a few hundred rows, run analyze, check to see if the 
>> explain for inserts is showing index scans or not.  If not, load a 
>> few more hundred rows, analyze, rinse, repeat.
>>
>> Also, look for fk/pk mismatches.  I.e. an int4 field pointing to an 
>> int8 field.  That's a performance killer, so if the pk/fk types don't 
>> match, see if you can change your field types to match and try again.
>>
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 2: you can get off all lists at once with the unregister command
>>     (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
>>
>
>



In response to

pgsql-performance by date

Next:From: Czuczy GergelyDate: 2004-02-03 07:25:38
Subject: PQexecParams and types
Previous:From: Christopher Kings-LynneDate: 2004-02-03 05:52:41
Subject: Re: MySQL+InnoDB vs. PostgreSQL test?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group