Re: performance of insert/delete/update

From: "Josh Berkus" <josh(at)agliodbs(dot)com>
To: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>, Wei Weng <wweng(at)kencast(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: performance of insert/delete/update
Date: 2002-11-21 22:26:40
Message-ID: web-1836282@davinci.ethosmedia.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

Scott,

> The absolutely most important thing to do to speed up inserts and
> updates
> is to squeeze as many as you can into one transaction. Within
> reason, of
> course. There's no great gain in putting more than a few thousand
> together at a time. If your application is only doing one or two
> updates
> in a transaction, it's going to be slower in terms of records written
> per
> second than an application that is updating 100 rows in a
> transaction.

This only works up to the limit of the memory you have available for
Postgres. If the updates in one transaction exceed your available
memory, you'll see a lot of swaps to disk log that will slow things
down by a factor of 10-50 times.

> Reducing triggers and foreign keys on the inserted tables to a
> minimum
> helps.

... provided that this will not jeapordize your data integrity. If you
have indispensable triggers in PL/pgSQL, re-qriting them in C will make
them, and thus updates on their tables, faster.

Also, for foriegn keys, it speeds up inserts and updates on parent
tables with many child records if the foriegn key column in the child
table is indexed.

> Putting WAL (e.g. $PGDATA/pg_xlog directory) on it's own drive(s).
>
> Putting indexes that have to be updated during inserts onto their own
>
> drive(s).
>
> Performing regular vacuums on heavily updated tables.
>
> Also, if your hardware is reliable, you can turn off fsync in
> postgresql.conf. That can increase performance by anywhere from 2 to
> 10
> times, depending on your application.

It can be dangerous though ... in the event of a power outage, for
example, your database could be corrupted and difficult to recover. So
... "at your own risk".

I've found that switching from fsync to fdatasync on Linux yields
marginal performance gain ... about 10-20%.

Also, if you are doing large updates (many records at once) you may
want to increase WAL_FILES and CHECKPOINT_BUFFER in postgresql.conf to
allow for large transactions.

Finally, you want to structure your queries so that you do the minimum
number of update writes possible, or insert writes. For example, a
procedure that inserts a row, does some calculations, and then modifies
several fields in that row is going to slow stuff down significantly
compared to doing the calculations as variables and only a single
insert. Certainly don't hit a table with 8 updates, each updating one
field instead of a single update statement.

-Josh Berkus

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephan Szabo 2002-11-21 22:33:26 Re: Optimizer & boolean syntax
Previous Message Wei Weng 2002-11-21 22:23:57 Re: performance of insert/delete/update

Browse pgsql-performance by date

  From Date Subject
Next Message scott.marlowe 2002-11-21 22:37:47 Re: [ADMIN] H/W RAID 5 on slower disks versus no raid on
Previous Message Wei Weng 2002-11-21 22:23:57 Re: performance of insert/delete/update