Re: High update activity, PostgreSQL vs BigDBMS

From: "Alex Turner" <armtuk(at)gmail(dot)com>
To: "Guy Rouillier" <guyr-ml1(at)burntmail(dot)com>
Cc: "PostgreSQL Performance" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: High update activity, PostgreSQL vs BigDBMS
Date: 2006-12-30 03:22:35
Message-ID: 33c6269f0612291922s2a784533n9e6652af79158da1@mail.gmail.com
Lists: pgsql-performance

You should search the archives for Luke Lonergan's posting about how I/O in
PostgreSQL is significantly bottlenecked because it's not async. A 12-disk
array is going to saturate PostgreSQL's theoretical maximum write throughput,
so BigDBMS is always going to win in such a configuration. You can also look
at Bizgres, which allegedly eliminates some of these problems and is cheaper
than most big commercial RDBMS products.

Alex.

On 12/28/06, Guy Rouillier <guyr-ml1(at)burntmail(dot)com> wrote:
>
> I don't want to violate any license agreement by discussing performance,
> so I'll refer to a large, commercial PostgreSQL-compatible DBMS only as
> BigDBMS here.
>
> I'm trying to convince my employer to replace BigDBMS with PostgreSQL
> for at least some of our Java applications. As a proof of concept, I
> started with a high-volume (but conceptually simple) network data
> collection application. This application collects files of 5-minute
> usage statistics from our network devices, and stores a raw form of
> these stats into one table and a normalized form into a second table.
> We are currently storing about 12 million rows a day in the normalized
> table, and each month we start new tables. For the normalized data, the
> app inserts rows initialized to zero for the entire current day first
> thing in the morning; then, throughout the day as stats are received, it
> executes updates against those existing rows. So the app has very high
> update activity.
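>
> (A minimal sketch of that pattern, purely for illustration -- the table
> and column names here are hypothetical, not the real schema:)
>
>     -- one normalized stats table per month
>     CREATE TABLE stats_200612 (
>         device_id   integer   NOT NULL REFERENCES devices (device_id),
>         sample_time timestamp NOT NULL,
>         bytes_in    bigint    NOT NULL DEFAULT 0,
>         bytes_out   bigint    NOT NULL DEFAULT 0,
>         PRIMARY KEY (device_id, sample_time)
>     );
>
>     -- first thing in the morning: pre-insert zeroed rows for every device
>     -- and every 5-minute slot of the current day (288 slots)
>     INSERT INTO stats_200612 (device_id, sample_time)
>     SELECT d.device_id, current_date + (n * interval '5 minutes')
>     FROM devices d, generate_series(0, 287) AS n;
>
>     -- as each 5-minute stats file is processed: update the pre-inserted rows
>     UPDATE stats_200612
>        SET bytes_in = 1234, bytes_out = 5678
>      WHERE device_id = 42
>        AND sample_time = '2006-12-28 10:35:00';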
>
> In my test environment, I have a dual-x86 Linux platform running the
> application, and an old 4-CPU Sun Enterprise 4500 running BigDBMS and
> PostgreSQL 8.2.0 (only one at a time). The Sun box has 4 disk arrays
> attached, each with 12 SCSI hard disks (a D1000 and three A1000s, for
> those familiar with these devices). The arrays are set up with RAID5. So
> I'm working with a consistent hardware platform for this comparison. I'm
> only processing a small subset of files (144).
>
> BigDBMS processed this set of data in 20000 seconds, with all foreign
> keys in place. With all foreign keys in place, PG took 54000 seconds to
> complete the same job. I've tried various autovacuum settings (disabled,
> and a 30-second naptime) and they don't seem to make much difference. What
> does seem to make a difference is eliminating all the foreign keys; in
> that configuration, PG takes about 30000 seconds. Better, but BigDBMS
> still has it beat significantly.
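>
> (The two autovacuum setups described above amount to roughly the
> following in postgresql.conf -- values illustrative:)
>
>     # run 1: autovacuum disabled
>     autovacuum = off
>
>     # run 2: autovacuum on with a 30-second naptime
>     stats_start_collector = on    # both required for autovacuum in 8.2
>     stats_row_level       = on
>     autovacuum            = on
>     autovacuum_naptime    = 30    # seconds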
>
> I've got PG configured so that the system database is on disk array
> 2, as are the transaction log files. The default table space for the
> test database is disk array 3. I've got all the reference tables (the
> tables to which the foreign keys in the stats tables refer) on this
> array. I also store the stats tables on this array. Finally, I put the
> indexes for the stats tables on disk array 4. I don't use disk array 1
> because I believe it is a software array.
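>
> (Roughly, a layout like that could be created along these lines --
> tablespace names, paths, the database name, and the index are
> illustrative:)
>
>     -- system catalogs and pg_xlog (WAL) live on array 2; pg_xlog can be
>     -- a symlink to a directory on that array
>     CREATE TABLESPACE array3 LOCATION '/array3/pgdata';
>     CREATE TABLESPACE array4 LOCATION '/array4/pgdata';
>
>     -- default tablespace for the test database is array 3, so the
>     -- reference tables and the stats tables land there
>     CREATE DATABASE statsdb TABLESPACE array3;
>
>     -- indexes on the stats tables go to array 4
>     CREATE INDEX stats_200612_sample_time_idx
>         ON stats_200612 (sample_time) TABLESPACE array4;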
>
> I'm out of ideas how to improve this picture any further. I'd
> appreciate some suggestions. Thanks.
>
> --
> Guy Rouillier
>
