Re: Problem with pgstat timeouts

From:	"Benjamin Krajmalnik" <kraj(at)servoyant(dot)com>
To:	"Benjamin Krajmalnik" <kraj(at)servoyant(dot)com>, "pgsql-admin" <pgsql-admin(at)postgresql(dot)org>
Subject:	Re: Problem with pgstat timeouts
Date:	2012-01-04 17:54:52
Message-ID:	F4E6A2751A2823418A21D4A160B689888CA623@fletch.stackdump.local
Lists:	pgsql-admin

I have done some more checking, and iostat shows that the pg_xlog drive is significantly busier than before. It was barely busy when we first spun up the server (total %busy since the server was started is about 6%), but it now sits in the 80s almost constantly. We have a set of partitioned tables which are continuously updated, and judging by their size they no longer fit in the shared memory that was allocated.

pg_xlog is on a SAS RAID 1. The server is set up with streaming replication to an identical server. The one thing I just checked is the RAID configuration on the server:
db1# megacli64 -LDInfo -lAll -aAll

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 2.452 TB
State : Optimal
Strip Size : 256 KB
Number Of Drives per span:2
Span Depth : 6
Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAhead, Direct, No Write Cache if Bad BBU
Access Policy : Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None

Virtual Drive: 1 (Target Id: 1)
Name :
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 418.656 GB
State : Optimal
Strip Size : 256 KB
Number Of Drives : 2
Span Depth : 1
Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAhead, Direct, No Write Cache if Bad BBU
Access Policy : Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
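
Both virtual drives above report a default policy of WriteBack but a current policy of WriteThrough. That kind of mismatch can be picked out mechanically. The following is a minimal Python sketch, assuming the output format shown above; the function name `find_cache_policy_mismatches` is hypothetical and not part of any MegaCLI tooling.

```python
import re


def find_cache_policy_mismatches(ldinfo_text):
    """Return (virtual_drive_id, default_policy, current_policy) for every
    virtual drive whose current cache policy differs from its default.

    Assumes text in the layout printed by `megacli64 -LDInfo -lAll -aAll`
    as pasted above; other layouts are not handled.
    """
    mismatches = []
    drive_id = None
    default_policy = None
    for raw_line in ldinfo_text.splitlines():
        line = raw_line.strip()
        header = re.match(r"Virtual Drive:\s*(\d+)", line)
        if header:
            # A new virtual drive section starts; reset per-drive state.
            drive_id = int(header.group(1))
            default_policy = None
            continue
        if line.startswith("Default Cache Policy:"):
            default_policy = line.split(":", 1)[1].strip()
        elif line.startswith("Current Cache Policy:"):
            current_policy = line.split(":", 1)[1].strip()
            if (
                drive_id is not None
                and default_policy is not None
                and current_policy != default_policy
            ):
                mismatches.append((drive_id, default_policy, current_policy))
    return mismatches
```

Fed the output above, this would flag both virtual drive 0 and virtual drive 1. With "No Write Cache if Bad BBU" set, a controller commonly falls back to WriteThrough on its own when the battery is missing, discharged, or in a relearn cycle, so the battery state is probably worth checking before forcing the policy.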

Is it possible that the WriteThrough cache policy is what is causing the high I/O (and maybe the pgstat wait timeouts as well)?
If this is the case, would it be safe to change the cache policy to WriteBack?

Additionally, and somewhat unrelated, is there anything special which I need to do when restarting the primary server vis-à-vis the streaming replication server? In other words, if I were to restart the main server, will the streaming replication server reconnect and pick up once the primary comes online?

In response to

Problem with pgstat timneouts at 2011-12-22 16:32:51 from Benjamin Krajmalnik

Responses

Re: Problem with pgstat timeouts at 2012-01-04 18:43:59 from Kevin Grittner
