
Re: High update activity, PostgreSQL vs BigDBMS

From: "Alex Turner" <armtuk(at)gmail(dot)com>
To: "Guy Rouillier" <guyr-ml1(at)burntmail(dot)com>
Cc: "PostgreSQL Performance" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: High update activity, PostgreSQL vs BigDBMS
Date: 2006-12-30 03:22:35
Message-ID: 33c6269f0612291922s2a784533n9e6652af79158da1@mail.gmail.com
Lists: pgsql-performance
You should search the archives for Luke Lonergan's posting about how I/O in
PostgreSQL is significantly bottlenecked because it's not async.  A 12-disk
array is going to max out PostgreSQL's theoretical write capacity to
disk, and therefore BigRDBMS is always going to win in such a config.  You
can also look towards Bizgres, which allegedly eliminates some of these
problems, and is cheaper than most BigRDBMS products.
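
For anyone picking this thread up from the archives: beyond the async-I/O point, a handful of postgresql.conf settings are the usual first stop for a write-heavy load on an 8.2-era install.  The fragment below is an illustrative sketch, not values tuned for Guy's box:

```
# postgresql.conf fragment (PostgreSQL 8.2-era parameter names; values illustrative)
shared_buffers = 512MB        # the shipped default is far too small for a dedicated server
checkpoint_segments = 32      # fewer, larger checkpoints under heavy write load
checkpoint_timeout = 15min
wal_buffers = 1MB             # default of 64kB is tiny for high commit rates
autovacuum = on               # 8.2 ships with autovacuum off by default
```

Putting pg_xlog on its own spindles (as Guy describes below) matters at least as much as any of these.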

Alex.

On 12/28/06, Guy Rouillier <guyr-ml1(at)burntmail(dot)com> wrote:
>
> I don't want to violate any license agreement by discussing performance,
> so I'll refer to a large, commercial PostgreSQL-compatible DBMS only as
> BigDBMS here.
>
> I'm trying to convince my employer to replace BigDBMS with PostgreSQL
> for at least some of our Java applications.  As a proof of concept, I
> started with a high-volume (but conceptually simple) network data
> collection application.  This application collects files of 5-minute
> usage statistics from our network devices, and stores a raw form of
> these stats into one table and a normalized form into a second table.
> We are currently storing about 12 million rows a day in the normalized
> table, and each month we start new tables.  For the normalized data, the
> app inserts rows initialized to zero for the entire current day first
> thing in the morning, then throughout the day as stats are received,
> executes updates against existing rows.  So the app has very high update
> activity.
>
> In my test environment, I have a dual-x86 Linux platform running the
> application, and an old 4-CPU Sun Enterprise 4500 running BigDBMS and
> PostgreSQL 8.2.0 (only one at a time).  The Sun box has 4 disk arrays
> attached, each with 12 SCSI hard disks (a D1000 and 3 A1000s, for those
> familiar with these devices).  The arrays are set up with RAID 5.  So I'm
> working with a consistent hardware platform for this comparison.  I'm
> only processing a small subset of files (144).
>
> BigDBMS processed this set of data in 20000 seconds, with all foreign
> keys in place.  With all foreign keys in place, PG took 54000 seconds to
> complete the same job.  I've tried various approaches to autovacuum
> (none, every 30 seconds) and it doesn't seem to make much difference.  What
> does seem to make a difference is eliminating all the foreign keys; in
> that configuration, PG takes about 30000 seconds.  Better, but BigDBMS
> still has it beat significantly.
>
> I've got PG configured so that the system database is on disk array
> 2, as are the transaction log files.  The default table space for the
> test database is disk array 3.  I've got all the reference tables (the
> tables to which the foreign keys in the stats tables refer) on this
> array.  I also store the stats tables on this array.  Finally, I put the
> indexes for the stats tables on disk array 4.  I don't use disk array 1
> because I believe it is a software array.
>
> I'm out of ideas how to improve this picture any further.  I'd
> appreciate some suggestions.  Thanks.
>
> --
> Guy Rouillier
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>
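
The workload Guy describes (pre-insert zeroed rows first thing in the morning, then update them as stats arrive) can be sketched as follows.  This is a minimal editorial illustration with invented table and column names, using sqlite3 as a stand-in for PostgreSQL; the access pattern is what matters, not the engine:

```python
import sqlite3

# Stand-in for the PostgreSQL test database; the pattern is identical.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE stats_normalized (
        device_id      INTEGER,
        interval_start INTEGER,   -- minutes past midnight, 5-minute grain
        usage          INTEGER,
        PRIMARY KEY (device_id, interval_start)
    )""")

# First thing in the morning: one zeroed row per device per 5-minute slot.
devices = range(100)
slots = range(0, 24 * 60, 5)          # 288 intervals per day
conn.executemany(
    "INSERT INTO stats_normalized VALUES (?, ?, 0)",
    [(d, s) for d in devices for s in slots])

# Throughout the day, each collected stat is an UPDATE of an existing row,
# which is what makes the workload update-heavy rather than insert-heavy.
def record_stat(device_id, slot, usage):
    conn.execute(
        "UPDATE stats_normalized SET usage = ? "
        "WHERE device_id = ? AND interval_start = ?",
        (usage, device_id, slot))

record_stat(7, 600, 12345)   # device 7, 10:00 interval
conn.commit()
```

Under PostgreSQL's MVCC, every such UPDATE writes a new row version rather than overwriting in place, which is why vacuum behavior and free-space-map settings weigh so heavily on this pattern.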
