
Re: New hardware thoughts

From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: Ben Suffolk <ben(at)vanilla(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: New hardware thoughts
Date: 2006-10-20 14:58:10
Message-ID: 463B9F43-74FC-4DFE-ABC4-357228901174@fastcrypt.com
Lists: pgsql-performance
Ben,

On 20-Oct-06, at 3:49 AM, Ben Suffolk wrote:

> Hello all,
>
> I am currently working out the best type of machine for a high
> volume pgsql database that I am going to need for a project. I will be
> purchasing a new server specifically for the database, and it won't  
> be running any other applications. I will be using FreeBSD 6.1 Stable.
>
> I think it may be beneficial if I give a brief overview of the  
> types of database access. There are several groups of tables and  
> associated accesses to them.
>
> The first can be thought of as user details and configuration
> tables. They will have low read and write access (say around 10 -
> 20 a min). Sized at around half a million rows.
>
> The second part is logging. This will be used occasionally for
> reads when reports are run, but I will probably back that off to
> more aggregated data tables, so these can probably be thought of as
> write-only tables. Several tables will each have around 200-300
> inserts a second. They can be archived on a regular basis to keep
> the size down, maybe once a day, or once a week. Not sure yet.
>
> The third part will be transactional and will have around 50
> transactions a second. A transaction is made up of a query followed
> by an update, followed by approx 3 inserts. In addition some of
> these tables will be read outside of the transactions at approx once
> per second.
>
> There will be around 50 simultaneous connections.
>
> I hope that overview is a) enough and b) useful background to this  
> discussion.
>
> I have some thoughts but I need them validating / discussing. If I
> had the money I could buy the hardware and spend time testing
> different options; the thing is, I need to get this pretty much right on
> the hardware front first time. I'll almost certainly be buying Dell
> kit, but could go for HP as an alternative.
>
> Processor : I understand that pgsql is not CPU intensive, but that
> each connection uses its own process. The HW has an option of up to
> 4 dual core Xeon processors. My thoughts would be that more lower
> spec processors would be better than fewer higher spec ones. But
> the question is, are 4 (8 cores) wasted because there will be so much
> blocking on I/O? Are 2 (4 cores) processors enough? I was thinking 2
> x 2.6G dual core Xeons would be enough.
>
> Memory : I know this is very important for pgsql, and the more you  
> have the more of the tables can reside in memory. I was thinking of  
> around 8 - 12G, but the machine can hold a lot more. Thing is
> memory is still quite expensive, and so I don't want to over spec it if
> it's not going to get used.
>
> Disk : Ok so this is the main bottleneck of the system. And the  
> thing I know least about, so need the most help with. I understand  
> you get good improvements if you keep the transaction log on a  
> different disk from the database, and that raid 5 is not as good as  
> people think unless you have lots of disks.
>
> My option in disks is either 5 x 15K rpm disks or 8 x 10K rpm disks  
> (all SAS), or if I pick a different server I can have 6 x 15K rpm  
> or 8 x 10K rpm (again SAS). In each case controlled by a PERC 5/i  
> (which I think is an LSI MegaRAID SAS 8408E card).
>
You mentioned a "Perc" controller, so I'll assume this is a Dell.

My advice is to find another supplier. Check the archives for Dell.

Basically you have no idea what the Perc controller is since it is  
whatever Dell decides to ship that day.

In general though you are going down the right path here. Disks
first, memory second, CPU third.

Dave

> So the question here is will more disks at a slower speed be better
> than fewer disks at a higher speed?
>
> Assuming I was going to have a mirrored pair for the O/S and
> transaction logs that would leave me with 3 or 4 15K rpm disks for the
> database. 3 would mean raid 5 (not great at 3 disks), 4 would give
> me a raid 10 option if I wanted it.  Or I could have raid 5 across
> all 5/6 disks and not separate the transaction logs and database onto
> different disks. Better performance from raid 5 with more disks,
> but does having the transaction logs and database on the same disks
> counteract / worsen the performance?
>
> If I had the 8 10K disks, I could have 2 as a mirrored pair for O/S
> and transaction logs, and still have 6 for raid 5. But the disks are slower.
>
> Anybody have any good thoughts on my disk predicament, and which
> options will serve me better?
>
> Your thoughts are much appreciated.
>
> Regards
>
> Ben

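For anyone sizing a similar box, the write load described in the quoted message can be put into rough numbers. This is only a back-of-the-envelope sketch: the message says "several" logging tables without giving a count, so the table counts tried below are assumptions, and the function name is made up for illustration. Each transaction is counted as 4 writes (1 update plus approx 3 inserts), per the description above.

```python
def write_statements_per_sec(log_tables, inserts_per_log_table,
                             tx_per_sec, writes_per_tx=4):
    """Rough count of write statements per second.

    log_tables            -- number of logging tables (ASSUMED; the post only says "several")
    inserts_per_log_table -- inserts per second per logging table (200-300 in the post)
    tx_per_sec            -- transactions per second (around 50 in the post)
    writes_per_tx         -- 1 update + approx 3 inserts per transaction
    """
    logging_writes = log_tables * inserts_per_log_table
    transactional_writes = tx_per_sec * writes_per_tx
    return logging_writes + transactional_writes


if __name__ == "__main__":
    # Try a few assumed table counts against the low and high insert rates.
    for log_tables in (2, 3, 5):
        for inserts in (200, 300):
            rate = write_statements_per_sec(log_tables, inserts, tx_per_sec=50)
            print(f"{log_tables} logging tables x {inserts}/s -> {rate} writes/s")
```

The logging inserts dominate the total in every case tried, which fits the "disks first" ordering given in the reply.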

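The "more slow disks or fewer fast disks" question can also be roughed out with the usual rule of thumb for random I/O: one operation costs about an average seek plus half a rotation, and a small random write costs about 4 disk operations on raid 5 versus 2 on raid 10. The sketch below is an estimate under those assumptions only. The seek times are guesses rather than vendor figures, the function names are invented for the example, and a controller write cache is ignored entirely, so treat the output as a way to compare layouts and not as a prediction.

```python
# Rule-of-thumb disk operations needed per small random write.
WRITE_PENALTY = {"raid5": 4, "raid10": 2}


def disk_iops(rpm, avg_seek_ms):
    """Approximate random IOPS of one disk: 1 / (average seek + half a rotation)."""
    half_rotation_ms = 60_000 / rpm / 2
    return 1000 / (avg_seek_ms + half_rotation_ms)


def array_write_iops(disks, rpm, avg_seek_ms, level):
    """Approximate random write IOPS of an array, ignoring any controller cache."""
    return disks * disk_iops(rpm, avg_seek_ms) / WRITE_PENALTY[level]


if __name__ == "__main__":
    # Data-disk layouts left after a mirrored pair is set aside for O/S and logs.
    # Seek times (3.5 ms for 15K, 4.5 ms for 10K) are ASSUMED values.
    layouts = [
        ("3 x 15K raid 5", 3, 15_000, 3.5, "raid5"),
        ("4 x 15K raid 10", 4, 15_000, 3.5, "raid10"),
        ("6 x 10K raid 5", 6, 10_000, 4.5, "raid5"),
        ("6 x 10K raid 10", 6, 10_000, 4.5, "raid10"),
    ]
    for name, disks, rpm, seek_ms, level in layouts:
        iops = array_write_iops(disks, rpm, seek_ms, level)
        print(f"{name}: roughly {iops:.0f} random write IOPS")
```

Swapping in measured seek times for the actual drives, or real benchmark numbers, would be the obvious next step before relying on any of this.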