
Re: Postgres insert performance and storage requirement compared to Oracle

From: Scott Carey <scott(at)richrelevance(dot)com>
To: Mladen Gogala <mladen(dot)gogala(at)vmsinfo(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Postgres insert performance and storage requirement compared to Oracle
Date: 2010-10-27 17:48:35
Message-ID: E2FA14BE-A47B-4673-9DE1-851EB74009F2@richrelevance.com
Lists: pgsql-hackers, pgsql-performance
On Oct 26, 2010, at 2:54 PM, Mladen Gogala wrote:

> On 10/26/2010 5:27 PM, Jon Nelson wrote:
>> start loop:
>>   populate rows in temporary table
>>   insert from temporary table into permanent table
>>   truncate temporary table
>>   loop
>> 
>> I do something similar, where I COPY data to a temporary table, do
>> lots of manipulations, and then perform a series of INSERTS from the
>> temporary table into a permanent table.
>> 
> 
> 1) It's definitely not faster because you have to insert into the 
> temporary table, in addition to inserting into the permanent table.

It is almost always significantly faster than a direct bulk load into the final table.
* The temp table has no indexes; the final table usually does, and bulk operations on indexes are faster than per-row operations.
* The final table might require both updates and inserts, and doing these in bulk from a temp staging table is far faster than doing them per row.
* You don't even have to commit after each merge from the temp table -- you can loop until it's all done, then commit once, though this can have table/index bloat implications if you are doing updates.
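A minimal sketch of that staging pattern (the `target`/`stage` names, columns, and file path are hypothetical; on PostgreSQL 9.5+ the update/insert pair could be collapsed into INSERT ... ON CONFLICT):

```sql
BEGIN;

-- Stage table: no indexes, so the bulk load is cheap.
CREATE TEMPORARY TABLE stage (id int, val text) ON COMMIT DROP;
COPY stage FROM '/tmp/batch.csv' CSV;  -- hypothetical input file

-- Bulk update of rows that already exist in the target.
UPDATE target t
   SET val = s.val
  FROM stage s
 WHERE t.id = s.id;

-- Bulk insert of rows that are new.
INSERT INTO target (id, val)
SELECT s.id, s.val
  FROM stage s
 WHERE NOT EXISTS (SELECT 1 FROM target t WHERE t.id = s.id);

COMMIT;
```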

> 2) This is what I had in mind:
> 
> mgogala=# create table a(c1 int);
> CREATE TABLE
> mgogala=# create temporary table t1(c1 int) on commit delete rows;
> CREATE TABLE
> mgogala=# begin;
> BEGIN
> mgogala=# insert into t1 select generate_series(1,1000);
> INSERT 0 1000
> mgogala=# insert into a select * from t1;
> INSERT 0 1000
> mgogala=# commit;
> COMMIT
> mgogala=# select count(*) from a;
>  count
> -------
>   1000
> (1 row)
> 
> mgogala=# select count(*) from t1;
>  count
> -------
>      0
> (1 row)
> 
> The table is created with the "on commit delete rows" option, which means 
> there is no need to do a "truncate". The "truncate" command is heavy 
> artillery. Truncating a temporary table is like shooting ducks in a duck 
> pond with a howitzer.

???  Test it.  DELETE is slow, but TRUNCATE is nearly instantaneous for normal tables, and temp tables are no different.  Maybe in Oracle TRUNCATE is a howitzer; in Postgres it's lightweight.  Also, your loop above requires a commit after every 1000 rows.  What if you need all the rows to become visible at once, or not at all?  What if you fail part way through?  One big transaction is often a better idea, and sometimes a requirement.  Especially in Postgres, which has no undo log, bulk inserts in one large transaction work out very well -- usually better than multiple smaller transactions.
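For example, because TRUNCATE is transactional in Postgres, the whole batch loop can run inside a single transaction, using the same `a`/`t1` tables as above:

```sql
BEGIN;

-- batch 1
INSERT INTO t1 SELECT generate_series(1, 1000);
INSERT INTO a SELECT * FROM t1;
TRUNCATE t1;    -- cheap, and rolls back with the transaction on failure

-- batch 2, 3, ... repeat as needed

COMMIT;         -- all batches become visible atomically
```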
> 
> -- 
> 
> Mladen Gogala
> Sr. Oracle DBA
> 1500 Broadway
> New York, NY 10036
> (212) 329-5251
> http://www.vmsinfo.com
> The Leader in Integrated Media Intelligence Solutions
> 
> 
> 
> 
> -- 
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance

