Re: AIX slow buffer reads

From: André Volpato <andre(dot)volpato(at)ecomtecnologia(dot)com(dot)br>
To: Brad Nicholson <bnichols(at)ca(dot)afilias(dot)info>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: AIX slow buffer reads
Date: 2010-10-27 17:05:09
Message-ID: 1141414014.10692.1288199109474.JavaMail.root@zimbra01a
Lists: pgsql-performance


----- Original message -----
| On 10-10-26 05:04 PM, André Volpato wrote:
| > ----- Original message -----
| > | On 10-10-25 03:26 PM, André Volpato wrote:
| > |> | On Mon, Oct 25, 2010 at 2:21 PM, André Volpato
| > |> |<andre(dot)volpato(at)ecomtecnologia(dot)com(dot)br> wrote:
| >
| > (...)
| >
| > |> |> These times keep repeating after the second run, and I can
| > |> |> confirm AIX isn't touching the disks anymore.
| > |> |> I've never seen this behaviour before. I heard about Direct I/O
| > |> |> and I was thinking about giving it a shot.
| > |> |>
| > |> |> Any ideas?
| > |> |>
| > |> |
| > |> | I doubt disk/io is the problem.
| > |>
| > |> Me neither.
| > |> Like I said, AIX does not touch the storage when running the query.
| > |> It became CPU-bound after the data got into cache.
| > |
| > | Have you confirmed that the hardware is ok on both servers?
| > |
| >
| > The hardware was recently installed and checked by the vendor team.
| > AIX box is on JS22:
| > PostgreSQL 8.4.4, AIX 5.3-9 64bits, SAN IBM DS3400, 8x450GB SAS 15K
| > Raid-5
| > 8GB RAM (DDR2 667)
| >
| > # lsconf
| > System Model: IBM,7998-61X
| > Processor Type: PowerPC_POWER6
| > Processor Implementation Mode: POWER 6
| > Processor Version: PV_6
| > Number Of Processors: 4
| > Processor Clock Speed: 4005 MHz
| > CPU Type: 64-bit
| > Kernel Type: 64-bit
| > Memory Size: 7680 MB
| >
| > Debian box is on HS21:
| > PostgreSQL 8.4.4, Debian 4.3.2 64bits, SAN IBM DS3400, 5x300GB SAS
| > 15K Raid-0
| > 7GB RAM (DDR2 667)
| > We are forced to use RedHat on this machine, so we are virtualizing
| > the Debian box.
| >
| > # cpuinfo
| > processor : [0-7]
| > vendor_id : GenuineIntel
| > cpu family : 6
| > model : 23
| > model name : Intel(R) Xeon(R) CPU E5420 @ 2.50GHz
| > stepping : 6
| > cpu MHz : 2500.148
| > cache size : 6144 KB
| >
| >
| >
| > | Have both OS's been tuned by people that know how to tune the
| > | respective OS's? AIX is very different than Linux, and needs to be
| > | tuned
| > | accordingly.
| >
| > We've been tuning AIX for the last 3 weeks, and lots of tunables
| > got changed.
| > On Debian, we have far more experience, and it's been a challenge to
| > understand how AIX works.
| >
| > Most important tunes:
| > page_steal_method=1
| > lru_file_repage=0
| > kernel_heap_psize=64k
| > maxperm%=90
| > maxclient%=90
| > minperm%=20
| >
| > Disk:
| > chdev -l hdisk8 -a queue_depth=24
| > chdev -l hdisk8 -a reserve_policy=no_reserve
| > chdev -l hdisk8 -a algorithm=round_robin
| > chdev -l hdisk8 -a max_transfer=0x400000
| >
| > HBA:
| > chdev -l fcs0 -P -a max_xfer_size=0x400000 -a num_cmd_elems=1024
| >
| > Postgres:
| > shared_buffers = 2304MB
| > effective_io_concurrency = 5
|
| I wonder if effective_io_concurrency has anything to do with it. It was
| implemented and mainly tested on Linux, and I am unsure if it will do
| anything on AIX. The plan you posted for the query does bitmap index
| scans, which is what effective_io_concurrency will speed up.
|
| Can you post the output of explain analyze for that query on both AIX
| and Linux? That will show where the time is being spent.

I changed the queries in order to make a more meaningful comparison.

Debian first run (23s):
http://explain.depesz.com/s/1fT

AIX first run (40s):
http://explain.depesz.com/s/CRG

Debian cached consecutive runs (8s):
http://explain.depesz.com/s/QAi

AIX cached consecutive runs (12s):
http://explain.depesz.com/s/xJU
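
To put the four timings above side by side, here is a small Python sketch. The numbers are just the rounded wall-clock times listed above; the variable names are made up for illustration, and nothing here is read from the actual plans.

```python
# Rounded wall-clock times in seconds, copied from the four runs above.
timings = {
    "debian": {"first": 23.0, "cached": 8.0},
    "aix": {"first": 40.0, "cached": 12.0},
}

# How many times slower AIX is than Debian, for each kind of run.
aix_vs_debian = {
    run: timings["aix"][run] / timings["debian"][run]
    for run in ("first", "cached")
}

# How much each box gains once the data is already cached.
cache_speedup = {
    box: runs["first"] / runs["cached"] for box, runs in timings.items()
}

for run, factor in aix_vs_debian.items():
    print(f"AIX vs Debian, {run} run: {factor:.2f}x slower")
for box, factor in cache_speedup.items():
    print(f"{box}: cached run is {factor:.2f}x faster than the first run")
```

So AIX is behind in both cases, and the gap is still there on the cached runs, where no disk access is involved.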

Both boxes are running with DDR2 667, so RAM speed seems to be the bottleneck now.
We're about to try RedHat EL6 in the next few days.

|
| If it is being spent in the bitmap index scan, try setting
| effective_io_concurrency to 0 for Linux, and see what effect that has.

I disabled effective_io_concurrency on AIX, but it made no difference to the bitmap index scan times.
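
For anyone who wants to repeat this check in a single session on either box, a minimal sketch follows. It only builds the SQL text to feed to psql; the function name and the placeholder query are hypothetical, the values 0 and 5 are the ones mentioned in this thread, and it assumes the server accepts the setting on that platform.

```python
def build_psql_script(query, settings=(0, 5)):
    """Build SQL that runs EXPLAIN ANALYZE on `query` once per setting value."""
    statement = query.strip().rstrip(";")
    lines = []
    for value in settings:
        # Session-level change only; postgresql.conf is left untouched.
        lines.append(f"SET effective_io_concurrency = {int(value)};")
        lines.append(f"EXPLAIN ANALYZE {statement};")
    return "\n".join(lines) + "\n"


# Hypothetical placeholder query; substitute the real one being compared.
script = build_psql_script("SELECT count(*) FROM some_table WHERE some_col = 1;")
print(script)
```

The printed text can be saved to a file and run with psql -f, so both settings are measured back to back with the cache in the same state.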

| --
| Brad Nicholson 416-673-4106
| Database Administrator, Afilias Canada Corp.
|

Regards, Andre Volpato
