Re: how to configure my new server

From: "philip johnson" <philip(dot)johnson(at)atempo(dot)com>
To: "Josh Berkus" <josh(at)agliodbs(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: how to configure my new server
Date: 2003-02-10 18:02:21
Message-ID: NDBBJLHHAKJFNNCGFBHLMECGELAA.philip.johnson@atempo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

pgsql-performance-owner(at)postgresql(dot)org wrote:
> Philip,
>
>>
>> someone could come back to first request ?
>>
>
> Insistent, aren't you? ;-)
>
>>> Yes no Raid, but will could change soon
>
> Adding RAID 1+0 could simplify your job enormously. It would
> prevent you from having to figure out what to put on each disk. If
> it were my machine, and I knew that the database was more important
> than the other services, I'd build it like this:
>
> Array 1: Disk 1: 18Go Ultra 3 (Ultra 160) SCSI 15 Ktpm
> Disk2 : 18Go Hot Plug Ultra 3 (Ultra 160) SCSI 15
> Ktpm
>
> Contains: Linux, Apache, Swap
>
> Array 2:
> Di:sk3 18Go Hot Plug Ultra 3 (Ultra 160) SCSI 15 Ktpm
> Disk4 18Go Hot Plug Ultra 3 (Ultra 160) SCSI 15 Ktpm
>
> Contains: PostgreSQL and databases
>
> Disk5 36Go Hot Plug Ultra 3 (Ultra 160) SCSI 15 Ktpm
>
> Contains: Postgresql log, backup partition.
>
> Alternately:
> Put all of the above on one *smart* RAID5 controller, with
> on-controller memory and battery. Might give you better performance
> considering your disk setup.
>
>>> how can I put indexes on a seperate disk ?
>
> Move the index object (use the oid2name package in /contrib to find
> the index) to a different location, and symlink it back to its
> original location. Make sure that you REINDEX at maintainence time,
> and don't drop and re-create the index, as that will have the effect
> of moving it back to the original location.
>
>>>>> linux values:
>>>>> kernel.shmmni = 4096
>>>>> kernel.shmall = 32000000
>>>>> kernel.shmmax = 256000000
>>> I took a look a the performance archive, and it's not possible to
>>> find real info on how to set these 3 values.
>
> Yeah. Personally, I just raise them until I stop getting error
> messages from Postgres. Perhaps someone on the list could speak to
> the danger of setting any of these values too high?
>
>>> I'm surprised that there's no spreadsheet to calculate those values.
>>> There are many threads, but it seems that no one is able to find a
>>> rule to define values.
>
> That's correct. There is no rule, because there are too many
> variables, and the value of many of those variables is a matter of
> opinion. As an
> *abbreviated* list:
> 1) Your processors and RAM; 2) Your drive setup and speed; 3) the
> frequency of data reads; 4) the frequency of data writes; 5) the
> average complexity of queries; 6) use of database procedures
> (functions) for DML; 7) your maintainence plan (e.g. how often can
> you run VACUUM FULL?); 8) the expected data population of tables
> (how many rows, how many tables); 9) your ability to program for
> indexed vs. non-indexed queries; 10) do you do mass data loads? ;
> 11) is the server being used for any other hihg-memory/networked
> applications? ; 12) the expected number of concurrent users; 13) use
> of large objects and/or large text fields; etc.
>
> As a result, a set of values that work really well for me might crash
> your database. It's an interactive process. Justin Clift started
> a project to create an automated interactive postgresql.conf tuner,
> one that would repeatedly test the speed of different queries against
> your database, overnight while you sleep. However, he didn't get
> very far and I haven't had time to help.
>
>>>> 1. How many queries does my database handle per second or minute?
>>>> can't say now
>
> This has a big influence on your desired sort_mem and shared_buffer
> settings. Make some estimates.
>
>>>>
>>>> 2. How big/complex are those queries?
>>>
>>> Not really complex and big as you can see
>
> OK, so nothing that would require you to really jack up your sort or
> shared memory beyond levels suggested by other factors. However, you
> don't say how many rows these queries usually return, which has a
> substantial effect on desired sort_mem.
>
> A good, if time-consuming, technique for setting sort_mem is to move
> it up and down (from, say 512 to 4096) seeing at what level your
> biggest meanest queries slow down noticably ... and then set it to
> one level just above that.
>
>>> There are more insert/update than read, because I'm doing table
>>> synchronization from an SQL Server database. Every 5 minutes I'm
>>> looking for change in SQL Server Database.
>>> I've made some stats, and I found that without user acces, and only
>>> with the replications I get 2 millions query per day
>
> In that case, making sure that your WAL files (the pg_xlog directory)
> is located on a seperate drive which *does nothing else* during
> normal operation is your paramount concern for performance. You'll
> also need to carefully prune your indexes down to only the ones you
> really need to avoid slowing your inserts and updates.
>
>>>> 4. What large tables in my database get queried
>>>> simultaneously/together? why this questions ?
>
> If you're not using RAID, it would affect whether you should even
> consider moving a particular table or index to a seperate drive. If
> you have two tables, each of which is 3 million records, and they are
> quried joined together in 50% of data reads, then one of those tables
> is a good candidate for moving to another drive.
>
> Good luck!

thanks very much

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message jeandre 2003-02-11 11:10:12
Previous Message Josh Berkus 2003-02-10 17:35:14 Re: how to configure my new server