Re: tuning questions

From: Jack Coates <jack(at)lyris(dot)com>
To: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: tuning questions
Date: 2003-12-06 01:22:42
Message-ID: 1070673761.13542.534.camel@cletus.lyris.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

On Fri, 2003-12-05 at 09:26, Josh Berkus wrote:
> Jack,
>
> > The frustrating thing is, we also have a UP P3-500 with 512M RAM and two
> > IDE drives with the same PG install which is doing okay with this load
> > -- still half the speed of MS-SQL2K, but usable. I'm at a loss.
>
> Overall, I'm really getting the feeling that this procedure was optimized for
> Oracle and/or MSSQL and is hitting some things that aren't such a good idea
> for PostgreSQL. I highly suggest that you try using log_duration and
> log_statement (and in 7.4 log_min_duration_statement) to try to locate which
> particular statements are taking the longest.

I'll definitely buy that as round two of optimization, but round one is
still "it's faster on the slower server."

hdparm -I is identical between the boxes, filesystem structure layout is
identical, disk organization isn't identical, but far worse: the UP low
ram box has PG on /dev/hdb, ew. Predictably, vmstat shows low numbers...
but steady numbers.

dev is the box which goes fast, and I was wrong, it's actually a 2GHz
P4. rufus is the box which goes slow. During the big fetch:
dev bi sits around 2000 blocks for twenty seconds while bo is around 50
blocks, then bo jumps to 800 or so while the data is returned, then
we're done.

rufus bi starts at 16000 blocks, then drops steadily while bo climbs.
After a minute or so, bi stabilizes at 4096 blocks, then bo bursts to
return the data. Then the next fetch starts, and it's bi of 500, bo of
300 for several minutes.

These observations certainly all point to Eric and Thierry's
recommendations to better organize the filesystem and get faster disks..
except that the dev box gets acceptable performance.

So, I've dug into postgresql.conf on dev and rufus, and here's what I
found:

RUFUS

how much
ram do
you
have?

75%
converted to 8K pages of that for effective_cache

15% of
that or
512M,
whichever is larger, converted to 8K pages for shared_buffers
15% of
that
converted to 8K pages for vacuum_mem

how many
messages
will you
send
between
vacuums?

divide
that by
2 and
divide
by 6 for
max_fsm_pages

DEV

how much
ram do
you
have?

48%
converted to 8K pages of that for effective_cache

6.5% of
that or
512M,
whichever is larger, converted to 8K pages for shared_buffers
52% of
that
converted to 8K pages for vacuum_mem

max_fsm_pages untouched on this box.

I adjusted rufus's configuration to match those percentages, but left
max_fsm_pages dialed up to 500000. Now Rufus's vmstat shows much better
behavior: bi 12000 blocks gradually sloping down to 3000 during the big
select, bo steady until it's ready to return. As more jobs come in, we
see overlap areas where bi is 600-ish and bo is 200-ish, but they only
last a few tens of seconds.

The big selects are still a lot slower than they are on the smaller
database and overall performance is still unacceptable. Next I dialed
max_fsm_pages back down to 10000 -- no change. Hm, maybe it's been too
long since the last vacuumdb --analyze, let's give it another.

hdparm -Tt shows that disk performance is crappo on rufus, half what it
is on dev -- and freaking dev is using 16 bit IO! This is a motherboard
IDE controller issue.

South Bridge: VIA vt8233
Revision: ISA 0x0 IDE 0x6

That's it, I'm throwing out this whole test series and starting over
with different hardware. Database server is now a dual 2GHz Xeon with
2GB RAM & 2940UW SCSI, OS and PG's logs on 36G drive, PG data on 9GB
drive. Data is importing now and I'll restart the tests tonight.
--
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, jack(at)lyris(dot)com
"Interoperability is the keyword, uniformity is a dead end."
--Olivier Fourdan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joe Conway 2003-12-06 01:53:03 Re: 7.4.1 ... slight change of scheduale ...
Previous Message Alvar Freude 2003-12-06 01:14:11 Re: bytea, index and like operator again and detailed

Browse pgsql-performance by date

  From Date Subject
Next Message Robert Treat 2003-12-06 02:54:52 Re: Help tracking down problem with inserts slowing down...
Previous Message Neil Conway 2003-12-05 23:38:47 Re: Help tracking down problem with inserts slowing