Re: Scrub one large table against another (vmstat output)

From: Brendan Curran <brendan(dot)curran(at)gmail(dot)com>
To: AgentM <agentm(at)themactionfaction(dot)com>, Pg Performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Scrub one large table against another (vmstat output)
Date: 2006-10-11 17:25:02
Message-ID: 452D28EE.6000105@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

>
> What prevents you from using an aggregate function?
>

I guess I could actually obtain the results in an aggregate function and use those to maintain a
summary table. There is a web view that requires 'as accurate as possible' numbers to be queried per
group (all 40 groups are displayed on the same page) and so constant aggregates over the entire
table would be a nightmare.

> Probably not 2x, but better performance than now. You probably don't
> want RAID 1, depending on your setup, many list member swear by RAID 10.
> Of course, your setup will depend on how much money you have to burn.
> That said, RAID 1 testing will allow you to determine the upper bounds
> of your hardware. Some folks say they get better performance with WAL
> off the main RAID, some keep it on. Only testing will allow you to
> determine what is optimal.

I will have to try moving WAL off those raid spindles, I have seen the posts regarding this.

> In the meantime, you need to identify the
> bottleneck of your operation. You should collect vmstat and iostat
> statistics for your present setup. Good luck!
>

I have to confess that I am a bit of a novice with vmstat. Below is a sample of my vmstat output
while running two scrubbing queries simultaneously:

machine:/dir# vmstat -S M 2
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 1 4 117 15 2962 0 0 100 25 96 107 2 0 86 11
0 3 4 117 15 2962 0 0 4884 1860 415 841 18 1 52 29
1 1 4 115 15 2964 0 0 2246 1222 462 394 8 0 51 41
0 2 4 114 14 2967 0 0 3932 2238 485 613 12 0 62 25
1 1 4 115 13 2966 0 0 3004 1684 507 609 8 0 60 31
0 3 4 116 13 2965 0 0 4688 4000 531 613 15 1 52 33
1 1 4 117 13 2964 0 0 2890 268 433 441 9 1 58 32
0 1 4 114 13 2968 0 0 2802 4708 650 501 8 1 64 28
0 2 4 114 13 2968 0 0 4850 1696 490 574 15 1 57 27
0 2 4 116 13 2966 0 0 4300 3062 540 520 13 1 61 26
0 2 4 115 13 2966 0 0 3292 3608 549 455 10 1 65 24
0 3 4 115 13 2966 0 0 4856 2098 505 564 15 1 59 26
0 3 4 115 13 2966 0 0 1608 2314 447 413 4 0 63 33
0 3 4 116 13 2966 0 0 6206 1664 442 649 18 1 52 29
1 1 4 115 13 2966 0 0 1886 1262 464 412 5 0 60 35
0 3 4 118 13 2964 0 0 2510 4138 571 493 7 1 64 28
1 1 4 117 13 2964 0 0 1632 56 325 373 5 0 53 42
0 3 4 116 13 2965 0 0 5358 3510 504 649 14 1 59 26
1 1 4 118 13 2964 0 0 2814 920 447 403 8 0 63 29

I know that wa is the time spent waiting on IO, but I lack a benchmark to determine just what I
should expect from my hardware (three 146GB U320 SCSI 10k drives in raid 5 on a Dell PERC4ei PE2850
controller). Those drives are dedicated completely to a /data mount that contains only
/data/postgresql/8.1/main. I have another two drives in raid 1 for everything else (OS, apps, etc.).
Can you give me any pointers based on that vmstat output?

Regards and Thanks,
Brendan

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jim C. Nasby 2006-10-11 20:27:58 Collect stats during seqscan (was: Simple join optimized badly?)
Previous Message Brendan Curran 2006-10-11 16:53:41 Re: Scrub one large table against another