Re: Transaction Performance Question

From: "Rick Gigger" <rick(at)alpinenetworking(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Transaction Performance Question
Date: 2003-10-30 16:38:44
Message-ID: 00db01c39f04$49284f90$0700a8c0@trogdor
Lists: pgsql-general

> > In the following situation:
> >
> > You do a large transaction where lots of rows are updated.
> > All of your tables/indexes are cached in memory.
> >
> > When are the updated rows written out to disk? When they are updated
> > inside the transaction, or when the transaction is completed?
>
> The data is written out but not made real, so to speak, during each
> update. I.e. the updates individually add all these rows. At the end of
> the transaction, if we rollback, all the tuples that were written out are
> simply not committed, and therefore the last version of that record
> remains the last one in the chain.
>
> If the transaction is committed then each tuple becomes the last in its
> chain (could it be second to last because of other transactions? I'm not
> sure.)

I realize how committing the transaction works from the user's perspective; I
am asking here about the internal implementation. For instance, if I do an
update inside a transaction, postgres could, in order to make sure data is
not lost, make sure that the data is flushed out to disk and fsynced. That
way it could tell me if there was a problem writing that data out to disk.
But if it is in the middle of a transaction, I would think that you could
update the tuples cached in memory and return, then start sending the tuples
out to disk in the background. When you issue the commit, of course,
everything would need to be flushed out to disk and fsynced, and any errors
could be reported before the transaction finished, so that it could still be
rolled back.
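
To make that concrete, here is a rough sketch of the behavior I have in mind
(the table and column names are just made up for illustration):

    BEGIN;
    UPDATE items SET price = 1.25 WHERE item_id = 1;  -- could return as soon as the cached tuple is changed
    UPDATE items SET price = 2.50 WHERE item_id = 2;  -- writing to disk could continue in the background
    COMMIT;  -- only here would everything have to be flushed and fsynced,
             -- and any write error could still abort the whole transaction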

It seems like if I had to update, say, 39,000 rows with separate update
statements, it would be a lot faster if each update statement could just
modify memory and return, flushing out to disk in the background while I
continue processing the other updates. Maybe it already does this, or maybe
it is a bad idea for some reason. I don't understand the inner workings of
postgres well enough to say; that is why I'm asking.
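
For what it's worth, today the 39,000 updates are sent as individual
statements. I assume wrapping them all in one explicit transaction would at
least mean a single commit at the end instead of 39,000 separate ones
(made-up names again):

    -- what I am doing now: each statement is its own implicit transaction
    UPDATE items SET price = 1.25 WHERE item_id = 1;
    UPDATE items SET price = 2.50 WHERE item_id = 2;
    -- ... and so on for all 39,000 rows ...

    -- versus one explicit transaction, with a single commit at the end
    BEGIN;
    UPDATE items SET price = 1.25 WHERE item_id = 1;
    UPDATE items SET price = 2.50 WHERE item_id = 2;
    -- ... all 39,000 updates ...
    COMMIT;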

Also, is there any way to issue a whole bunch of updates together like this
that is faster than just issuing 39,000 individual update statements?
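
One thing I could imagine trying, though I have no idea whether it is
actually any faster, is loading the new values into a temporary table and
doing a single UPDATE against it (names made up for illustration):

    CREATE TEMP TABLE new_prices (item_id integer, price numeric);
    -- populate new_prices, e.g. with COPY or a batch of INSERTs
    UPDATE items
    SET    price = new_prices.price
    FROM   new_prices
    WHERE  items.item_id = new_prices.item_id;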
