Re: Postmaster using only 4-5% CPU

From: Edoardo Serra <osdevel(at)webrainstorm(dot)it>
To: Scott Marlowe <smarlowe(at)g2switchworks(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Postmaster using only 4-5% CPU
Date: 2006-03-23 09:14:24
Message-ID: 7.0.0.16.2.20060322120000.0796c930@webrainstorm.it
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

At 18.44 21/03/2006, Scott Marlowe wrote:
>Here's what's happening. On the "fast" machine, you are almost
>certainly using IDE drives.

Oh yes, the fast machine has IDE drives, you got it ;)

>Meanwhile, back in the jungle... The machine with IDE drives operates
>differently. Most, if not all, IDE drives, when told by the OS to
>fsync() tell the OS immediately that the fsync() call has completed, and
>the data is written to the drive. Shortly thereafter, the drive
>actually commences to write the data out. When it gets a chance.

I really didn't know this behaviour of IDE drives.
I was stracing the postmaster while investigating the problem and noticed
many fsync syscalls (one after each INSERT).

I was investigating on it but I didn't explain me why SCSI was slower.
You helped me a lot ;) tnx

>For PostgreSQL, the way IDE drives operate is dangerous. Write data
>out, call fsync(), get an immediate return, mark the data as committed,
>move on the next operation, operator trips over power cord / power
>conditioner explodes, power supply dies, brown out causes the machine to
>reboot, et. al., and when the machine comes up, PostgreSQL politely
>informs you that your database is corrupt, and you come to the
>pgsql-general group asking how to get your database back online. Very
>bad.

Yes, it sounds very bad... what about SATA drives ?
I heard about command queueing in SATA but I don't know if the kernel
handles it properly

>Try wrapping the inserts in the sql file in begin; / commit; statements,
>like so:
>
>begin;
>insert into table ...
>(100,000 inserts here)
>insert into table ...
>commit;
>
>and it should fly.

Oh, yes with the insert wrapped in a transaction the import time is as follows:
- SCSI: 35 secs
- IDE: 50 secs

>When a good friend of mine first started using PostgreSQL, he was a
>total MySQL bigot. He was importing a 10,000 row dataset, and made a
>smartassed remark after 10 minutes how it would have imported in minutes
>on MySQL. It was a test database, so I had him stop the import, delete
>all the imported rows, and wrap the whole import inside begin; and
>commit;
>
>The import took about 20 seconds or so.

;)

>Now, for the interesting test. Run the import on both machines, with
>the begin; commit; pairs around it. Halfway through the import, pull
>the power cord, and see which one comes back up. Don't do this to
>servers with data you like, only test machines, obviously. For an even
>more interesting test, do this with MySQL, Oracle, DB2, etc...

I will surely run a test like this ;)

Tnx a lot again for help

Regards

Edoardo Serra

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Theo Kramer 2006-03-23 11:09:49 Re: Indexes with descending date columns
Previous Message Jojo Paderes 2006-03-23 06:19:24 Scaling up PostgreSQL in Multiple CPU / Dual Core Powered Servers