Skip site navigation (1) Skip section navigation (2)

Re: performance of insert/delete/update

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Tim Gardner <tgardner(at)codeHorse(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: performance of insert/delete/update
Date: 2002-11-26 00:23:57
Message-ID: Pine.LNX.4.33.0211251652200.8723-100000@css120.ihs.com (view raw or flat)
Thread:
Lists: pgsql-hackerspgsql-performance
On Mon, 25 Nov 2002, Tim Gardner wrote:

> >The funny thing it, they've often avoided transactions because they
> >figured they'd be slower than just inserting the rows, and you kinda have
> >to make them sit down first before you show them the performance increase
> >from putting all those inserts into a single transaction.
> >
> >No offense meant, really.  It's just that you seemed to really doubt that
> >putting things into one transaction helped, and putting things into one
> >big transaction if like the very first postgresql lesson a lot of
> >newcomers learn. :-)
> 
> Scott,
> 
> I'm new to postgresql, and as you suggested, this is 
> counter-intuitive to me.  I would have thought that having to store 
> all the inserts to be able to roll them back would take longer.  Is 
> my thinking wrong or not relevant?  Why is this not the case?

Your thinking on this is wrong, and it is counter-intuitive to think that 
a transaction would speed things up.  Postgresql is very different from 
other databases.

Postgresql was designed from day one as a transactional database.  Which 
is why it was so bothersome that an Oracle marketroid recently was telling 
the .org folks why they shouldn't use Postgresql because it didn't have 
transactions.  Postgresql may have a few warts here and there, but not 
supporting transactions has NEVER been a problem for it.

There are two factors that make Postgresql so weird in regards to 
transactions.  One it that everything happens in a transaction (we won't 
mention truncate for a while, it's the only exception I know of.)

The next factor that makes for fast inserts of large amounts of data in a 
transaction is MVCC.  With Oracle and many other databases, transactions 
are written into a seperate log file, and when you commit, they are 
inserted into the database as one big group.  This means you write your 
data twice, once into the transaction log, and once into the database.

With Postgresql's implementation of MVCC, all your data are inserted in 
real time, with a transaction date that makes the other clients ignore 
them (mostly, other read committed transactions may or may not see them.)

If there are indexes to update, they are updated in the same "invisible 
until committed" way.

All this means that your inserts don't block anyone else's reads as well.

This means that when you commit, all postgresql does is make them visible.

In the event you roll back a transaction, the tuples are all just marked 
as dead and they get ignored.

It's interesting when you work with folks who came from other databases.  
My coworker, who's been using Postgresql for about 2 years now, had an 
interesting experience when he first started here.  He was inserting 
something like 10,000 rows.  He comes over and tells me there must be 
something wrong with the database, as his inserts have been running for 10 
minutes, and he's not even halfway through.  So I had him stop the 
inserts, clean out the rows (it was a new table for a new project) and 
wrap all 10,000 inserts into a transaction.  What had been running for 10 
minutes now ran in about 30 seconds.

He was floored.  

Well, good luck on using postgresql, and definitely keep in touch with the 
performance and general mailing lists.  They're a wealth of useful info.


In response to

Responses

pgsql-performance by date

Next:From: scott.marloweDate: 2002-11-26 00:30:00
Subject: Re: performance of insert/delete/update
Previous:From: Rod TaylorDate: 2002-11-26 00:20:03
Subject: Re: performance of insert/delete/update

pgsql-hackers by date

Next:From: scott.marloweDate: 2002-11-26 00:30:00
Subject: Re: performance of insert/delete/update
Previous:From: Rod TaylorDate: 2002-11-26 00:20:03
Subject: Re: performance of insert/delete/update

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group