Re: testing HS/SR - 1 vs 2 performance

From: "Erik Rijkers" <er(at)xs4all(dot)nl>
To: "Greg Smith" <greg(at)2ndquadrant(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: testing HS/SR - 1 vs 2 performance
Date: 2010-05-04 19:40:12
Message-ID: 64e9143b80c9c450464f6a6edf461032.squirrel@webmail.xs4all.nl
Lists: pgsql-hackers

On Tue, May 4, 2010 20:26, Greg Smith wrote:
> Erik Rijkers wrote:
>> OS: Centos 5.4
>> 2 quadcores: Intel(R) Xeon(R) CPU X5482 @ 3.20GHz
>> Areca 1280ML
>> primary and standby db both on a 12 disk array (sata 7200rpm, Seagate Barracuda ES.2)
>>
>
> To fill in from data you already mentioned upthread:
> 32 GB RAM
> CentOS release 5.4 (Final), x86_64 Linux, 2.6.18-164.el5
>
> Thanks for all the reporting you've done here, really helpful.
> Questions to make sure I'm trying to duplicate the right thing here:
>
> Is your disk array all configured as one big RAID10 volume, so
> essentially a 6-disk stripe with redundancy, or something else? In
> particular I want to know whether the WAL/database/archives are split onto
> separate volumes or all on one big one when you were testing.

Everything together: the raid is what Areca call 'raid10(1E)'.
(to be honest I don't remember what that 1E exactly means -
extra flexibility in the number of disks, I think).

Btw, some of my emails contained the postgresql.conf of both instances.

>
> Is this on ext3 with standard mount parameters?

ext3 noatime

> Also, can you confirm that every test you ran only had a single pgbench
> worker thread (-j 1 or not specified)? That looked to be the case from
> the ones I saw where you posted the whole command used. It would not

yes; the literal cmd:
time /var/data1/pg_stuff/pg_installations/pgsql.sr_primary/bin/pgbench -h /tmp -p 6565 -U rijkers
-n -S -c 20 -T 900 -j 1 replicas

To avoid wrapping in the emails I just removed '-h /tmp', '-U rijkers', and 'replicas'.

(I may have run the primary's pgbench binary also against the slave - don't think
that should make any difference)
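
For anyone trying to reproduce the run, here is a minimal sketch of how the command above could be assembled in a script, so that the full, unabbreviated form is always what gets logged. The helper name and its parameters are hypothetical (not part of pgbench or of the original test setup); the flags and values are the ones from the literal command above. The sketch only builds and prints the command, it does not run it.

```python
import shlex

def pgbench_select_only_cmd(pgbench, host, port, user, dbname,
                            clients=20, seconds=900, threads=1):
    """Build the argument list for a select-only pgbench run.

    Hypothetical helper for illustration; flags mirror the command quoted above.
    """
    return [
        pgbench,
        "-h", host,           # socket directory or host name
        "-p", str(port),      # server port
        "-U", user,           # database user
        "-n",                 # skip vacuuming before the test
        "-S",                 # select-only transactions
        "-c", str(clients),   # number of concurrent clients
        "-T", str(seconds),   # run duration in seconds
        "-j", str(threads),   # pgbench worker threads
        dbname,
    ]

cmd = pgbench_select_only_cmd(
    "/var/data1/pg_stuff/pg_installations/pgsql.sr_primary/bin/pgbench",
    host="/tmp", port=6565, user="rijkers", dbname="replicas",
)
# Print the complete command on one line, suitable for pasting into a report.
print(shlex.join(cmd))
```

Passing the list to something like subprocess.run(cmd) would execute it; raising threads is the knob to try when checking whether a single pgbench worker is the bottleneck.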

> surprise me to find that the CPU usage profile of a standby is just
> different enough from the primary that it results in the pgbench program
> not being scheduled enough time, due to the known Linux issues in that
> area. Not going to assume that, of course, just one thing I want to
> check when trying to replicate what you've run into.
>
> I didn't see any glaring HS performance issues like you've been
> reporting on last time I tried performance testing in this area, just a
> small percentage drop. But I didn't specifically go looking for it

Here, it seems repeatable, but does not occur with all scales.

Hm, maybe I should just dump *all* of my results on the wiki for reference. (I'll look at that
later).

> either. With your testing rig out of service, we're going to try and
> replicate that on a system here. My home server is like a scaled down
> version of yours (single quad-core, 8GB RAM, smaller Areca controller, 5
> disks instead of 12) and it's running the same CentOS version. If the
> problem's really universal I should see it here too.
>

Thanks,

Erik Rijkers
