Re: Slow Vacuum was: vacuum output question

From: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
To: "Dan Armbrust" <daniel(dot)armbrust(dot)list(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Slow Vacuum was: vacuum output question
Date: 2009-01-09 02:23:24
Message-ID: dcc563d10901081823y5247cdf0q41f7e455423b76c1@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, Jan 8, 2009 at 10:43 AM, Dan Armbrust
<daniel(dot)armbrust(dot)list(at)gmail(dot)com> wrote:
> On PostgreSQL 8.1, while a long vacuum is running, the output of
> vmstat 10 looks like this (sorry, can't format this very will in this
> e-mail client):
>
> r b swpd free buff cache si so bi bo
> in cs us sy id wa st
> 5 2 112 53732 4388 1163404 0 0 13524 1322 941 75053 45
> 55 0 0 0
> 3 2 112 51596 2336 1160032 0 0 20731 732 999 196116 35
> 58 0 7 0
> 3 2 112 53784 2448 1155388 0 0 5729 1680 716 114442 46
> 52 0 2 0
> 7 1 112 52088 2568 1156796 0 0 205 1919 546 28929 56
> 44 0 0 0
> 3 2 112 51772 2652 1157028 0 0 162 2010 534 2845 56 44 0 0 0
> 3 2 112 52296 2736 1156332 0 0 174 1910 517 2648 53 47 0 0 0
> 2 2 112 52736 2832 1155388 0 0 173 1960 521 2698 57 43 0 0 0
> 6 1 112 51172 2908 1157400 0 0 173 1825 511 2621 52 48 0 0 0
> 2 1 112 52156 2984 1155876 0 0 152 1766 495 2669 53 47 0 0 0
> 5 2 112 53828 3064 1154028 0 0 124 1762 498 2673 54 46 0 0 0
> 5 0 112 53396 3140 1154684 0 0 122 1803 532 2871 52 48 0 0 0
> 6 1 112 52004 3240 1155892 0 0 150 2004 522 3712 56 44 0 0 0
> 5 2 112 53100 3348 1152364 0 0 128 2142 543 2711 54 38 0 8 0
> 1 2 112 56704 3428 1148132 0 0 102 2229 560 2925 56 44 0 0 0
> 2 1 112 54876 3500 1150080 0 0 91 2129 598 3068 65 35 0 0 0
> 3 1 112 53196 3588 1152164 0 0 90 2091 542 2803 57 43 0 0 0
> 8 2 112 53528 3664 1151868 0 0 106 2180 553 2947 55 39 0 6 0
>
> Vacuum hasn't stopped yet at this point - its been running for many minutes.
>
>
> On PostgreSQL 8.3, we had this:
>
> r b swpd free buff cache si so bi bo
> in cs us sy id wa st
> 6 0 9604 51932 14276 1138996 0 0 3 1808 523 5115 45
> 9 42 4 0
> 2 0 9580 53284 14124 1138092 10 0 564 2561 585 5126 53
> 10 19 18 0
> 7 0 9564 53412 14144 1137696 0 0 0 1682 497 4985 42
> 9 48 2 0
> 3 0 9532 53320 13880 1137940 6 0 37 1757 538 4979 48 8 41 3
>
> More random data:
>
> On an overnight test of PostgreSQL 8.3 on Fedora Core 6, with
> auto-vacuum on (using all default settings) the amount of time that it
> takes to run a manual vacuum on a constant sized database (only row
> updates, or deletes followed by a replacement add) has doubled from 2
> seconds to 4 seconds. A reindex brings the time back down to 2
> seconds.

Well, your throughput on this machine is horrible. It looks like with
8.1 all your time is sys + cpu for your cpus, while with 8.3 you've
got more idle and more i/o wait, which tells me that 8.3 is smarter
about vacuuming, so it's spending less time working the cpus and more
time waiting for the i/o subsystem.

Wither way, getting only 2 or so megs a second write is pretty bad. I
can get those numbers from a laptop. An older laptop like a single
core 1.6GHz pentium M based T42 or something. My laptop, which is new
from last year, is about twice as fast as your server in terms of I/O.
I'm running 8.3.5 btw. Here's vmstat 10 output while vacuuming a
db:

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 1 45104 27228 26876 1806996 0 0 132 3545 1009 4531 17 4 50 29
0 1 45104 31680 26804 1803040 0 0 108 2144 1012 2187 15 3 39 44
7 0 45104 27120 26916 1807800 0 0 103 4439 1057 5018 17 4 37 42
2 1 45104 25732 26972 1808868 0 0 92 4174 1049 4439 16 4 37 43
0 1 45104 32928 26988 1801116 0 0 97 3709 1029 3254 17 3 34 45

pgbench gives me numbers like this:
0 1 45116 29920 26772 1810308 0 0 132 4540 932 2100 16 3 41 40
1 2 45116 31956 26732 1807844 0 0 380 8864 897 1824 12 4 32 52

You could replace that machine with something much faster for the
price of an upgrade or two on the dell. I spent a lot of effort in
the last year trying to increase the performance of our dell 1950s and
finally gave up and just replaced them as db servers. One decent
sized dual QC opteron can run the same load as 5 other database
servers while running under a much lower load and better performance
than any of the 5 dbs it replaced.

We had bought a 1950 last year with an MD-1000 before I started, and
it had cost about $12k in total for 7 SAS drives, controller and the
original 1950. The new db server to replace it cost $500 less, had 16
SAS drives, a much faster Areca 1680 RAID controller, and 8 opteron
cores. It is literally 10 to 40 times faster depending on the
workload.

The 1850s we have are half as fast as our 1950s as db servers, so I'm
pretty sure you could run faster with only a handful of SAS drives or
intel xm25s.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2009-01-09 05:21:19 Re: Improving compressibility of WAL files
Previous Message Bruce Momjian 2009-01-09 00:36:13 Re: Improving compressibility of WAL files