Re: Performance problems???

From: Doug McNaught <doug(at)wireboard(dot)com>
To: James Patterson <jpatterson(at)amsite(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Performance problems???
Date: 2001-10-30 20:07:41
Message-ID: m3n12828oi.fsf@belphigor.mcnaught.org
Lists: pgsql-hackers

James Patterson <jpatterson(at)amsite(dot)com> writes:

> I have observed some disturbing behavior with the latest (7.1.3) version of
> PostgreSQL.
>
> In an application that I am working on with a team of folks, there is a
> specific need to execute a series of SQL statements similar to those used in
> the 'loaddb.pl' script included below. Without getting into better ways to
> increment rowid's (this code is part of another tool that we are using), I'd
> like to know why I get the following results with PostgreSQL and MySQL.
>
> In 3 separate runs I get the following PostgreSQL results:
>
> o 1 - 2000 records inserted in 12 seconds.
> o 2001 - 4000 records inserted in 16 seconds.
> o 4001 - 6000 records inserted in 20 seconds.
>
> You see, there is a clear performance degradation here that is associated
> with the number of records in the database. It appears that the main culprit
> is the update statement that is issued (see 'loaddb.pl' script below). This
> performance behavior is not expected. Especially with so few rows in such a
> small table.

One thing you should definitely do is wrap the entire load loop
((update/select/insert) * N) in a single transaction. That alone will
give you a huge speedup: with autocommit on, every SQL statement runs
as its own transaction, and each commit forces a sync to disk.

You may still see some degradation as the table size grows, but actual
times should be more comparable to MySQL.
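
As a rough sketch of what I mean (I don't have your loaddb.pl in front
of me, so the connection parameters and loop variable here are just
placeholders), the DBI way to do this is to connect with AutoCommit
turned off and commit once after the loop:

    use DBI;

    # With AutoCommit off, DBD::Pg opens a transaction for us, and
    # nothing is synced to disk until we explicitly commit at the end.
    my $dbh = DBI->connect('dbi:Pg:dbname=testdb', 'user', 'password',
                           { AutoCommit => 0, RaiseError => 1 });

    for my $record (@records) {
        # ... the same update/select/insert statements loaddb.pl runs
        #     today, issued through $dbh->do() / $dbh->selectrow_array() ...
    }

    $dbh->commit;       # one disk sync for the whole batch
    $dbh->disconnect;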

> In 3 separate runs I get the following MySQL results:
>
> o 1 - 2000 records inserted in 6 seconds.
> o 2001 - 4000 records inserted in 5 seconds.
> o 4001 - 6000 records inserted in 6 seconds.
>
> You see, MySQL performs as expected. There is no performance degradation
> here that is related to the number of records in the database tables.
>
> I have been a huge fan and advocate of PostgreSQL. I was stunned to see this
> behavior. I am hoping that it is either a bug that has been fixed, or that I
> can alter my PostgreSQL configuration to eliminate this behavior.
>
> I have an urgent need to resolve this situation. If I cannot solve the
> problem soon, I will be forced to drop PostgreSQL in favor of MySQL. This is
> not something that I wish to do.

I think the main problem, or at least a big part of it, is that you're
not using the proper mechanism for generating sequential numbers. If
you used a real SEQUENCE instead of a one-row table, you wouldn't pay
the MVCC penalty of updating that table thousands of times.

I understand your reluctance to change existing code, but the fact is
that a sequence is the right way to do this in PostgreSQL. Because of
MVCC, every update of that one-row table creates a new copy of the row
and leaves the old version behind as a dead tuple; the table keeps
growing, and reads of it keep getting slower, until VACUUM is run.
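
To make that concrete, here is roughly what the change looks like in
DBI terms. The names are invented (adapt them to whatever loaddb.pl
actually uses), and $dbh is the same connection as in the sketch above:

    # One-time setup: a sequence instead of the one-row counter table.
    $dbh->do('CREATE SEQUENCE rowid_seq');

    # In the load loop, this one call replaces the UPDATE-then-SELECT
    # on the counter table, and it leaves no dead row versions behind.
    my ($rowid) = $dbh->selectrow_array("SELECT nextval('rowid_seq')");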

Try using a sequence along with wrapping everything in a transaction
(turn off autocommit and use BEGIN and COMMIT) and I think you'll be
pleasantly surprised.
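
One more detail: if the new sequence has to pick up where your existing
counter left off, you can seed it once with setval() before switching
over (the counter table and column names here are guesses again):

    # Seed the sequence from the current counter value, then stop using
    # (or drop) the old one-row counter table.
    my ($current) = $dbh->selectrow_array('SELECT id FROM rowid_counter');
    $dbh->do("SELECT setval('rowid_seq', $current)");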

-Doug
--
Let us cross over the river, and rest under the shade of the trees.
--T. J. Jackson, 1863
