Scott,

I can only answer a couple of the questions at the moment.  I had to kill the vacuum full and do a regular vacuum, so I can't get the iostat and vmstat outputs right now.  This message is the reason I was trying to run vacuum full:

INFO:  "license": found 257 removable, 20265895 nonremovable row versions in 1088061 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 18434951 unused item pointers.
687274 pages contain useful free space.
0 pages are entirely empty.
CPU 38.15s/37.02u sec elapsed 621.19 sec.
WARNING:  relation "licensing.license" contains more than "max_fsm_pages" pages with useful free space
HINT:  Consider using VACUUM FULL on this relation or increasing the configuration parameter "max_fsm_pages".
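
For a rough sense of scale, the figures in that output can be turned into sizes. This is only a back-of-the-envelope sketch: it assumes the default 8 kB PostgreSQL block size (not confirmed for this server), and the variable names are made up for illustration.

```python
# Figures copied from the VACUUM VERBOSE output above.
BLOCK_SIZE = 8192  # ASSUMPTION: default PostgreSQL block size, in bytes

total_pages = 1_088_061            # "in 1088061 pages"
pages_with_free_space = 687_274    # "687274 pages contain useful free space"
live_rows = 20_265_895             # "20265895 nonremovable row versions"
unused_item_pointers = 18_434_951  # "18434951 unused item pointers"

# Size of the table's heap on disk (indexes not included).
table_gib = total_pages * BLOCK_SIZE / 2**30

# Share of pages that the free space map would have to track for this table.
free_page_fraction = pages_with_free_space / total_pages

# Average number of live rows stored per page.
rows_per_page = live_rows / total_pages

print(f"heap size:              {table_gib:.1f} GiB")
print(f"pages with free space:  {free_page_fraction:.0%}")
print(f"live rows per page:     {rows_per_page:.1f}")
print(f"unused item pointers:   {unused_item_pointers:,}")

# For the WARNING to go away, max_fsm_pages has to cover at least the pages
# with free space in this one table, plus those of every other relation.
print(f"max_fsm_pages lower bound (this table only): {pages_with_free_space:,}")
```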

A clean restore of the database to another server takes up about 244GB on disk.  This server was at over 400GB yesterday, and now, after aggressive vacuuming by hand, is down to 350GB.  It had gotten so bad that the backup had still not finished when I got in yesterday, almost 8 hours after it started.
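
Put as numbers, that works out roughly as follows. This is a minimal sketch using only the three sizes quoted above; "over 400GB" is treated as exactly 400, so the peak figure is a lower bound, and the helper name is hypothetical.

```python
# Sizes in GB, as quoted in the paragraph above.
clean_gb = 244    # clean restore on another server
peak_gb = 400     # "over 400GB" yesterday, so treat this as a lower bound
current_gb = 350  # after aggressive manual vacuuming


def bloat(size_gb, baseline_gb=clean_gb):
    """Extra space relative to the clean restore, as a fraction of it."""
    return size_gb / baseline_gb - 1


peak_bloat = bloat(peak_gb)
current_bloat = bloat(current_gb)
reclaimed_gb = peak_gb - current_gb   # space recovered by vacuuming so far
remaining_gb = current_gb - clean_gb  # space still above the clean size

print(f"bloat at peak: at least {peak_bloat:.0%}")
print(f"bloat now:     {current_bloat:.0%}")
print(f"reclaimed:     {reclaimed_gb} GB, still above clean size: {remaining_gb} GB")
```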

The machine has been under heavy load 24/7 for a couple of months, so I have not been able to upgrade versions.  I am taking it offline this weekend and will install the latest.  I'll try to re-create the scenario I had going on yesterday over the weekend and get some io statistics.

Roger

Scott Marlowe wrote:
On Fri, Feb 13, 2009 at 10:20 AM, Roger Ging <rging@musicreports.com> wrote:
  
Hi,

I'm running vacuum full analyze verbose on a table with 20million rows and
11 indexes.  In top, I'm seeing [pdflush] and postgres: writer process each
using different cpu cores, with wait time well above 90% on each of them.
The vacuum has been running for several hours, and the last thing to show
on screen, over an hour ago, was :

DETAIL:  8577281 index row versions were removed.
736 index pages have been deleted, 736 are currently reusable.
CPU 7.57s/52.52u sec elapsed 381.70 sec.

That's the last index.

The vacuum process itself is using less than 2% of a core.
The pg version is 8.3.1 running on Suse.  Hardware is 2X dual core Opterons,
16 GB RAM, 24 drives in RAID 50

It would seem to me that the system is extremely IO bound, but I don't know
how to find out what specifically is wrong here.  Any advice greatly
appreciated.
    

A couple of questions.
Why vacuum full as opposed to a regular vacuum?
Why 8.3.1 which has known bugs, instead of 8.3.latest?
What do "vmstat 10" and "iostat -x 10" have to say about your drive
arrays while this vacuum is running?