James Patterson <jpatterson(at)amsite(dot)com> writes:
> I have observed some disturbing behavior with the latest (7.1.3) version of
> In an application that I am working on with a team of folks, there is a
> specific need to execute a series of SQL statements similar to those used in
> the 'loaddb.pl' script included below. Without getting into better ways to
> increment rowid's (this code is part of another tool that we are using), I'd
> like to know why I get the following results with PostgreSQL and MySQL.
> In 3 separate runs I get the following PostgreSQL results:
> o 1 - 2000 records inserted in 12 seconds.
> o 2001 - 4000 records inserted in 16 seconds.
> o 4001 - 6000 records inserted in 20 seconds.
> You see, there is a clear performance degradation here that is associated
> with the number of records in the database. It appears that the main culprit
> is the update statement that is issued (see 'loaddb.pl' script below). This
> performance behavior is not expected. Especially with so few rows in such a
> small table.
One thing you should definitely do is wrap the entire load loop
((update/select/insert) * N) in a transaction. This will give you a
huge speedup. Otherwise you are forcing a disk sync after every SQL
You may still see some degradation as the table size grows, but actual
times should be more comparable to MySQL.
> In 3 separate runs I get the following MySQL results:
> o 1 - 2000 records inserted in 6 seconds.
> o 2001 - 4000 records inserted in 5 seconds.
> o 4001 - 6000 records inserted in 6 seconds.
> You see, MySQL performs as expected. There is no performance degradation
> here that is related to the number of records in the database tables.
> I have been a huge fan and advocate of PostgreSQL. I was stunned to see this
> behavior. I am hoping that it is either a bug that has been fixed, or that I
> can alter my PostgreSQL configuration to eliminate this behavior.
> I have an urgent need to resolve this situation. If I cannot solve the
> problem soon, I will be forced to drop PostgreSQL in favor of MySQL. This is
> not something that I wish to do.
I think the main problem, or one of them, is that you're not using the
proper mechanism for generating sequential numbers. If you used a
real SEQUENCE instead of a one-row table you wouldn't get the MVCC
penalty from updating that table thousands of times, which is part of
your problem I think.
I understand your issue with not wanting to change existing code, but
the fact is that a sequence is the right way to do this in PostgreSQL.
Updating a one-row table as you're doing requires a new copy of the
row to be created each time it's updated (because of MVCC) which slows
things down until VACUUM is run.
Try using a sequence along with wrapping everything in a transaction
(turn off autocommit and use BEGIN and COMMIT) and I think you'll be
Let us cross over the river, and rest under the shade of the trees.
--T. J. Jackson, 1863
In response to
pgsql-hackers by date
|Next:||From: Tom Lane||Date: 2001-10-30 20:16:14|
|Subject: Re: timetz regression test is showing several DST-related failures |
|Previous:||From: Tom Lane||Date: 2001-10-30 20:05:04|
|Subject: Re: Odd error in complex query (7.2): Sub-SELECT uses un-GROUPed... |