From: | "Johnson, Shaunn" <SJohnson6(at)bcbsm(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | performance enhancements for PostgreSQL |
Date: | 2002-11-18 19:21:04 |
Message-ID: | 73309C2FDD95D11192E60008C7B1D5BB04C74253@snt452.corp.bcbsm.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Howdy:
I have a Linux server running PostgreSQL 7.2.1. with about 1 Gig of
memory. The proc speed is about 1.14 GHz.
I'm getting more and more concerned about how often
the database gets used and the days (like today) where I
wonder if buying more memory rather than buying a 2nd CPU
was such a great idea.
In an effort to enhance / streamline performance, I've done
the following:
* memory upgrade from 512M to 1G
* move RAID5 to scsi drives (10K RPM)
* set up cron script to vacuum database weekly
* set number of client connects ( i.e., /usr/bin/postmaster -i -B 128 -N 64
-d 4 )
But I'm at the point now that I can't kill some jobs. Yes, I know
I shouldn't use 'kill' in any forceful way, but just a kill seems to do
nothing (or, if it is doing something, it's not fast enough
for the user community and it's stopping production).
I got this from a co-worker:
[snip]
the server is blocking on access to the metadata tables. not even
logins are being processed. I'm not sure what caused the
problem, but I think a database restart is the best course,
which I have been trying to do. Do not kill -9, as it will corrupt the
WAL.
[/snip]
I see some errors in the messages file regarding the RAID drives
(the filesystem where the database lives)
[snip error]
Nov 18 12:33:51 hmp kernel: scsi : aborting command due to timeout : pid 0,
scsi
2, channel 2, id 0, lun 1 Write (10) 20 11 d2 76 9f 00 00 40 00
Nov 18 12:33:51 hmp kernel: scsi : aborting command due to timeout : pid 0,
scsi
2, channel 2, id 0, lun 1 Write (10) 20 11 d2 76 df 00 00 08 00
Nov 18 12:33:51 hmp kernel: scsi : aborting command due to timeout : pid 0,
scsi
2, channel 2, id 0, lun 1 Write (10) 20 11 d2 76 ef 00 00 08 00
Nov 18 12:33:52 hmp kernel: scsi : aborting command due to timeout : pid 0,
scsi
2, channel 2, id 0, lun 1 Write (10) 20 11 d2 77 27 00 00 40 00
Nov 18 12:33:52 hmp kernel: scsi : aborting command due to timeout : pid 0,
scsi
2, channel 2, id 0, lun 1 Write (10) 20 11 d2 77 67 00 00 40 00
Nov 18 12:33:53 hmp kernel: scsi : aborting command due to timeout : pid 0,
scsi
[/snip error]
my questions are:
* have can I figure out how access to the metadata tables
are being stopped? (my guess is the error on the scsi drive, but ... )
* how to restart PostgreSQL without running the risk of corrupting data?
* what are the benefits to adding a 2nd CPU over, say, more memory?
I'm sorry that I don't have enough information at this time ... I'm getting
swamped by users as I type this.
Thanks!
-X
From | Date | Subject | |
---|---|---|---|
Next Message | Bruce Momjian | 2002-11-18 19:23:01 | Re: [GENERAL] DECLARE CURSOR |
Previous Message | Gregory Seidman | 2002-11-18 19:19:31 | Re: More time manipulation.. |