VACUUM FULL taking accessivly long.

From: Darcy Buskermolen <darcy(at)wavefire(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: VACUUM FULL taking accessivly long.
Date: 2004-09-14 21:06:32
Message-ID: 200409141406.32352.darcy@wavefire.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

While doing some testing for someone, I have run across what appears to be a
bug that is only readily reproduceable in FreeBSD 4.x (dual Xeon 2.4 GHz,
2GB RAM, PG 7.4). The same test run on a pair of linux boxen, work as
expected.

This process has now been running 25 hours, is still running, and if left will
run indefantaly (I aborted that last one after it was churning away for 7
days, so that I could rebuild PG with debugging symbols)

There is no other access to this DB, let alone this table.

darcy=# \d archives_event
Table "public.archives_event"
Column | Type | Modifiers
-----------+--------------------------+-----------
id | integer |
sid | integer |
gid | integer |
cid | integer |
ts_old | integer |
num_child | integer |
timestamp | timestamp with time zone |
Indexes:
"ae_tsold_num_child_idx" btree (num_child, ts_old)
"archives_event_ts_num_child_idx" btree (num_child, "timestamp")

darcy=# VACUUM FULL verbose archives_event;
INFO: vacuuming "public.archives_event"
INFO: "archives_event": found 34879340 removable, 90000000 nonremovable row
versions in 1330139 pages
DETAIL: 0 dead row versions cannot be removed yet.
Nonremovable row versions range from 60 to 60 bytes long.
There were 55220660 unused item pointers.
Total free space (including removable row versions) is 4749495908 bytes.
621477 pages are or will become empty, including 0 at the end of the table.
621479 pages containing 4718314868 free bytes are potential move destinations.
CPU 41.48s/6.00u sec elapsed 123.21 sec.

INFO: index "archives_event_ts_num_child_idx" now contains 90000000 row
versions in 592668 pages
DETAIL: 3753431 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 56.75s/26.49u sec elapsed 1888.69 sec.


INFO: index "ae_tsold_num_child_idx" now contains 90000000 row versions in
493544 pages
DETAIL: 34879340 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 66.63s/192.01u sec elapsed 1983.45 sec.


$ps

USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
pgsql 9820 2.2 20.4 438656 427300 ?? D Mon08AM 51:49.86 postmaster:
darcy darcy [local] VACUUM (postgres)

$iostat -w1
tty ad0 da0 acd0 cpu
tin tout KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s us ni sy in id
0 19 0.00 0 0.00 16.71 785 12.81 0.00 0 0.00 0 0 1 0 99
0 19 0.00 0 0.00 21.94 184 3.94 0.00 0 0.00 1 0 0 0 99
0 19 12.67 6 0.07 21.59 186 3.92 0.00 0 0.00 1 0 2 0 97
^C

The da0 array is capable of 60MB/second random IO write.

I've attached the the process and obtained the following back trace:

(gdb) bt
#0 0x28377088 in read () from /usr/lib/libc.so.4
#1 0x815b2f8 in FileRead (file=36, buffer=0x302c7520 "s\001", amount=8192) at
fd.c:904
#2 0x81661d0 in mdread (reln=0x938a01c, blocknum=467914, buffer=0x302c7520
"s\001") at md.c:434
#3 0x8166f5f in smgrread (which=0, reln=0x938a01c, blocknum=467914,
buffer=0x302c7520 "s\001") at smgr.c:316
#4 0x8157fc3 in ReadBufferInternal (reln=0x938a01c, blockNum=467914,
bufferLockHeld=0 '\000') at bufmgr.c:224
#5 0x8157db2 in ReadBuffer (reln=0x938a01c, blockNum=467914) at bufmgr.c:118
#6 0x8088842 in _bt_getbuf (rel=0x938a01c, blkno=467914, access=1) at
nbtpage.c:401
#7 0x808b45f in btbulkdelete (fcinfo=0xbfbfe8bc) at nbtree.c:674
#8 0x81ca5ba in OidFunctionCall3 (functionId=332, arg1=154705948,
arg2=135251912, arg3=3217025732) at fmgr.c:1260
#9 0x8084a5c in index_bulk_delete (indexRelation=0x938a01c,
callback=0x80fc7c8 <tid_reaped>, callback_state=0xbfbfeac4) at indexam.c:618
#10 0x80fc666 in vacuum_index (vacpagelist=0xbfbfeac4, indrel=0x938a01c,
num_tuples=90000000, keep_tuples=0) at vacuum.c:2702
#11 0x80f893b in full_vacuum_rel (onerel=0x833ed30, vacstmt=0x82f5354) at
vacuum.c:944
#12 0x80f8818 in vacuum_rel (relid=84356092, vacstmt=0x82f5354,
expected_relkind=114 'r') at vacuum.c:853
#13 0x80f7f55 in vacuum (vacstmt=0x82f5354) at vacuum.c:290
#14 0x816d3d5 in ProcessUtility (parsetree=0x82f5354, dest=0x82f5234,
completionTag=0xbfbfecd4 "") at utility.c:827
#15 0x816bf8b in PortalRunUtility (portal=0x834501c, query=0x82f539c,
dest=0x82f5234, completionTag=0xbfbfecd4 "") at pquery.c:758
#16 0x816c08b in PortalRunMulti (portal=0x834501c, dest=0x82f5234,
altdest=0x82f5234, completionTag=0xbfbfecd4 "") at pquery.c:822
#17 0x816bbc2 in PortalRun (portal=0x834501c, count=2147483647,
dest=0x82f5234, altdest=0x82f5234, completionTag=0xbfbfecd4 "") at
pquery.c:483
#18 0x8168c9b in exec_simple_query (query_string=0x82f501c " VACUUM FULL
verbose archives_event;") at postgres.c:873
#19 0x816af53 in PostgresMain (argc=4, argv=0x82b84b4, username=0x82b847c
"darcy") at postgres.c:2868
#20 0x8147a11 in BackendFork (port=0x82bc200) at postmaster.c:2558
#21 0x8147247 in BackendStartup (port=0x82bc200) at postmaster.c:2201
#22 0x8145bae in ServerLoop () at postmaster.c:1113
#23 0x814570d in PostmasterMain (argc=1, argv=0x82b6030) at postmaster.c:891
#24 0x811a885 in main (argc=1, argv=0xbfbffcac) at main.c:214

Please let me know if you require any other information.

--
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx: 250.763.1759
http://www.wavefire.com

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim C. Nasby 2004-09-14 21:51:14 Re: APR 1.0 released
Previous Message Oliver Elphick 2004-09-14 20:59:17 Re: NEW used in a query that is not in a rule