BUG #5849: Stats Collector Frozen - Autovacuum Not Working Anymore

From: "Radu Ilie" <rilie(at)wsi(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #5849: Stats Collector Frozen - Autovacuum Not Working Anymore
Date: 2011-01-26 13:49:21
Message-ID: 201101261349.p0QDnLfg059307@wwwmaster.postgresql.org
Lists: pgsql-bugs


The following bug has been logged online:

Bug reference: 5849
Logged by: Radu Ilie
Email address: rilie(at)wsi(dot)com
PostgreSQL version: 8.4.4 32 bit
Operating system: Windows 7 Professional 64 bit
Description: Stats Collector Frozen - Autovacuum Not Working Anymore
Details:

We started noticing very poor performance one day ago. It turned out that
autovacuum was no longer working. Some tables had grown to 4GB when their
normal size is around 400MB, and no table had been autovacuumed in the last
day. When we tried a manual "VACUUM FULL" on one of the tables, we got
"pgstat wait timeout" at the psql console and the table did not shrink. The
Postgres log was full of "pgstat wait timeout" messages. We Googled the
message and found that it may indicate a stuck stats collector.
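
For anyone trying to pin down when the messages start, a rough sketch like
the following could tally them per hour. It assumes each log line begins with
a "YYYY-MM-DD HH:MM:SS" timestamp (for example log_line_prefix = '%t '); the
function name is made up for illustration, and lines without such a timestamp
are counted under "unknown".

```python
import re
from collections import Counter

# Assumed line shape (depends on log_line_prefix):
#   2011-01-25 13:05:12 EST WARNING:  pgstat wait timeout
TS_HOUR = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}):")


def timeouts_per_hour(lines):
    """Count 'pgstat wait timeout' log lines, grouped by date and hour."""
    counts = Counter()
    for line in lines:
        if "pgstat wait timeout" not in line:
            continue
        m = TS_HOUR.match(line)
        # Lines without a leading timestamp are still counted.
        counts[m.group(1) if m else "unknown"] += 1
    return counts
```

Passing an open log file (which iterates line by line) should be enough; a
sudden jump in the hourly count would mark roughly when the collector stopped
responding.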

So we restarted Postgres. It ran fine for 90 minutes, after which the
"pgstat wait timeout" log messages returned. We then followed the
instructions for obtaining a stack trace of the stats collector process
with windbg, using the 32-bit version of the debugging tools. We took the
stack trace at 3 different moments, about 1 minute apart. The stack traces
follow:

Microsoft (R) Windows Debugger Version 6.12.0002.633 X86

Copyright (c) Microsoft Corporation. All rights reserved.

*** wait with pending attach

Symbol search path is: C:\Program Files (x86)\PostgreSQL\8.4\symbols;SRV*c:\localsymbols*http://msdl.microsoft.com/download/symbols

Executable search path is:

ModLoad: 00400000 008b2000 C:\Program Files
(x86)\PostgreSQL\8.4\bin\postgres.exe

ModLoad: 77330000 774b0000 C:\Windows\SysWOW64\ntdll.dll

ModLoad: 76730000 76830000 C:\Windows\syswow64\kernel32.dll

ModLoad: 75050000 75096000 C:\Windows\syswow64\KERNELBASE.dll

ModLoad: 10000000 10034000 C:\Program Files
(x86)\PostgreSQL\8.4\bin\SSLEAY32.dll

ModLoad: 00290000 0038f000 C:\Program Files
(x86)\PostgreSQL\8.4\bin\LIBEAY32.dll

ModLoad: 73cf0000 73cf7000 C:\Windows\system32\WSOCK32.dll

ModLoad: 76b30000 76b65000 C:\Windows\syswow64\WS2_32.dll

ModLoad: 76070000 7611c000 C:\Windows\syswow64\msvcrt.dll

ModLoad: 76b70000 76c60000 C:\Windows\syswow64\RPCRT4.dll

ModLoad: 74ea0000 74f00000 C:\Windows\syswow64\SspiCli.dll

ModLoad: 74e90000 74e9c000 C:\Windows\syswow64\CRYPTBASE.dll

ModLoad: 76e60000 76e79000 C:\Windows\SysWOW64\sechost.dll

ModLoad: 75eb0000 75eb6000 C:\Windows\syswow64\NSI.dll

ModLoad: 76200000 76290000 C:\Windows\syswow64\GDI32.dll

ModLoad: 76630000 76730000 C:\Windows\syswow64\USER32.dll

ModLoad: 76a90000 76b30000 C:\Windows\syswow64\ADVAPI32.dll

ModLoad: 76150000 7615a000 C:\Windows\syswow64\LPK.dll

ModLoad: 76160000 761fd000 C:\Windows\syswow64\USP10.dll

ModLoad: 73d90000 73e2b000 C:\Windows\WinSxS\x86_microsoft.vc80.crt_1fc8b3b9a1e18e3b_8.0.50727.4927_none_d08a205e442db5b5\MSVCR80.dll

ModLoad: 61cc0000 61cd3000 C:\Program Files
(x86)\PostgreSQL\8.4\bin\libintl-8.dll

ModLoad: 66000000 660e7000 C:\Program Files
(x86)\PostgreSQL\8.4\bin\libiconv-2.dll

ModLoad: 1c000000 1c09b000 C:\Program Files
(x86)\PostgreSQL\8.4\bin\krb5_32.dll

ModLoad: 00140000 00147000 C:\Program Files
(x86)\PostgreSQL\8.4\bin\comerr32.dll

ModLoad: 00150000 00158000 C:\Program Files
(x86)\PostgreSQL\8.4\bin\k5sprt32.dll

ModLoad: 7c340000 7c396000 C:\Program Files
(x86)\PostgreSQL\8.4\bin\MSVCR71.dll

ModLoad: 00160000 00181000 C:\Program Files
(x86)\PostgreSQL\8.4\bin\gssapi32.dll

ModLoad: 00cc0000 00db1000 C:\Program Files
(x86)\PostgreSQL\8.4\bin\libxml2.dll

ModLoad: 00f10000 00fe9000 C:\Program Files
(x86)\PostgreSQL\8.4\bin\iconv.dll

ModLoad: 00190000 001a3000 C:\Program Files
(x86)\PostgreSQL\8.4\bin\zlib1.dll

ModLoad: 73bb0000 73bb8000 C:\Windows\system32\Secur32.dll

ModLoad: 76900000 76945000 C:\Windows\syswow64\WLDAP32.dll

ModLoad: 74ff0000 75050000 C:\Windows\system32\IMM32.DLL

ModLoad: 76830000 768fc000 C:\Windows\syswow64\MSCTF.dll

ModLoad: 73fe0000 7401c000 C:\Windows\system32\mswsock.dll

ModLoad: 73f80000 73f86000 C:\Windows\System32\wship6.dll

ModLoad: 73f90000 73f95000 C:\Windows\System32\wshtcpip.dll

(2a60.25b8): Break instruction exception - code 80000003 (first chance)

eax=7efd7000 ebx=00000000 ecx=00000000 edx=773cfa82 esi=00000000 edi=00000000

eip=7734000c esp=0210ff5c ebp=0210ff88 iopl=0 nv up ei pl zr na pe nc

cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246

ntdll!DbgBreakPoint:

7734000c cc int 3

0:002> ~*k

0 Id: 2a60.2404 Suspend: 1 Teb: 7efdd000 Unfrozen

ChildEBP RetAddr

00cbf788 75060962 ntdll!NtWaitForMultipleObjects+0x15

00cbf824 7674162d KERNELBASE!WaitForMultipleObjectsEx+0x100

00cbf86c 0058aff0 kernel32!WaitForMultipleObjectsExImplementation+0xe0

00cbf8b4 0058b210 postgres!pgwin32_waitforsinglesocket+0x1f0
[c:\pginstaller-repo\postgres.windows\src\backend\port\win32\socket.c @
197]

00cbf8e0 00594ff8 postgres!pgwin32_recv+0x90
[c:\pginstaller-repo\postgres.windows\src\backend\port\win32\socket.c @
318]

00cbfcec 005992c2 postgres!PgstatCollectorMain+0x198
[c:\pginstaller-repo\postgres.windows\src\backend\postmaster\pgstat.c @
2762]

00cbff28 00505157 postgres!SubPostmasterMain+0x352
[c:\pginstaller-repo\postgres.windows\src\backend\postmaster\postmaster.c @
4012]

00cbff44 006bb8dd postgres!main+0x177
[c:\pginstaller-repo\postgres.windows\src\backend\main\main.c @ 165]

00cbff88 76743677 postgres!__tmainCRTStartup+0x10f
[f:\sp\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 597]

00cbff94 77369f02 kernel32!BaseThreadInitThunk+0xe

00cbffd4 77369ed5 ntdll!__RtlUserThreadStart+0x70

00cbffec 00000000 ntdll!_RtlUserThreadStart+0x1b

1 Id: 2a60.2aec Suspend: 1 Teb: 7efda000 Unfrozen

ChildEBP RetAddr

01d0fea4 75057a15 ntdll!NtFsControlFile+0x15

01d0fee8 0058a6e7 KERNELBASE!ConnectNamedPipe+0x5d

01d0ff88 76743677 postgres!pg_signal_thread+0x97
[c:\pginstaller-repo\postgres.windows\src\backend\port\win32\signal.c @
275]

01d0ff94 77369f02 kernel32!BaseThreadInitThunk+0xe

01d0ffd4 77369ed5 ntdll!__RtlUserThreadStart+0x70

01d0ffec 00000000 ntdll!_RtlUserThreadStart+0x1b

# 2 Id: 2a60.25b8 Suspend: 1 Teb: 7efd7000 Unfrozen

ChildEBP RetAddr

0210ff58 773cfabe ntdll!DbgBreakPoint

0210ff88 76743677 ntdll!DbgUiRemoteBreakin+0x3c

0210ff94 77369f02 kernel32!BaseThreadInitThunk+0xe

0210ffd4 77369ed5 ntdll!__RtlUserThreadStart+0x70

0210ffec 00000000 ntdll!_RtlUserThreadStart+0x1b

0:002> G

(2a60.26e0): Break instruction exception - code 80000003 (first chance)

eax=7efd7000 ebx=00000000 ecx=00000000 edx=773cfa82 esi=00000000 edi=00000000

eip=7734000c esp=0210ff5c ebp=0210ff88 iopl=0 nv up ei pl zr na pe nc

cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246

ntdll!DbgBreakPoint:

7734000c cc int 3

0:002> ~*k

0 Id: 2a60.2404 Suspend: 1 Teb: 7efdd000 Unfrozen

ChildEBP RetAddr

00cbf788 75060962 ntdll!NtWaitForMultipleObjects+0x15

00cbf824 7674162d KERNELBASE!WaitForMultipleObjectsEx+0x100

00cbf86c 0058aff0 kernel32!WaitForMultipleObjectsExImplementation+0xe0

00cbf8b4 0058b210 postgres!pgwin32_waitforsinglesocket+0x1f0
[c:\pginstaller-repo\postgres.windows\src\backend\port\win32\socket.c @
197]

00cbf8e0 00594ff8 postgres!pgwin32_recv+0x90
[c:\pginstaller-repo\postgres.windows\src\backend\port\win32\socket.c @
318]

00cbfcec 005992c2 postgres!PgstatCollectorMain+0x198
[c:\pginstaller-repo\postgres.windows\src\backend\postmaster\pgstat.c @
2762]

00cbff28 00505157 postgres!SubPostmasterMain+0x352
[c:\pginstaller-repo\postgres.windows\src\backend\postmaster\postmaster.c @
4012]

00cbff44 006bb8dd postgres!main+0x177
[c:\pginstaller-repo\postgres.windows\src\backend\main\main.c @ 165]

00cbff88 76743677 postgres!__tmainCRTStartup+0x10f
[f:\sp\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 597]

00cbff94 77369f02 kernel32!BaseThreadInitThunk+0xe

00cbffd4 77369ed5 ntdll!__RtlUserThreadStart+0x70

00cbffec 00000000 ntdll!_RtlUserThreadStart+0x1b

1 Id: 2a60.2aec Suspend: 1 Teb: 7efda000 Unfrozen

ChildEBP RetAddr

01d0fea4 75057a15 ntdll!NtFsControlFile+0x15

01d0fee8 0058a6e7 KERNELBASE!ConnectNamedPipe+0x5d

01d0ff88 76743677 postgres!pg_signal_thread+0x97
[c:\pginstaller-repo\postgres.windows\src\backend\port\win32\signal.c @
275]

01d0ff94 77369f02 kernel32!BaseThreadInitThunk+0xe

01d0ffd4 77369ed5 ntdll!__RtlUserThreadStart+0x70

01d0ffec 00000000 ntdll!_RtlUserThreadStart+0x1b

# 2 Id: 2a60.26e0 Suspend: 1 Teb: 7efd7000 Unfrozen

ChildEBP RetAddr

0210ff58 773cfabe ntdll!DbgBreakPoint

0210ff88 76743677 ntdll!DbgUiRemoteBreakin+0x3c

0210ff94 77369f02 kernel32!BaseThreadInitThunk+0xe

0210ffd4 77369ed5 ntdll!__RtlUserThreadStart+0x70

0210ffec 00000000 ntdll!_RtlUserThreadStart+0x1b

0:002> G

(2a60.28d4): Break instruction exception - code 80000003 (first chance)

eax=7efd7000 ebx=00000000 ecx=00000000 edx=773cfa82 esi=00000000 edi=00000000

eip=7734000c esp=0210ff5c ebp=0210ff88 iopl=0 nv up ei pl zr na pe nc

cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246

ntdll!DbgBreakPoint:

7734000c cc int 3

0:002> ~*k

0 Id: 2a60.2404 Suspend: 1 Teb: 7efdd000 Unfrozen

ChildEBP RetAddr

00cbf788 75060962 ntdll!NtWaitForMultipleObjects+0x15

00cbf824 7674162d KERNELBASE!WaitForMultipleObjectsEx+0x100

00cbf86c 0058aff0 kernel32!WaitForMultipleObjectsExImplementation+0xe0

00cbf8b4 0058b210 postgres!pgwin32_waitforsinglesocket+0x1f0
[c:\pginstaller-repo\postgres.windows\src\backend\port\win32\socket.c @
197]

00cbf8e0 00594ff8 postgres!pgwin32_recv+0x90
[c:\pginstaller-repo\postgres.windows\src\backend\port\win32\socket.c @
318]

00cbfcec 005992c2 postgres!PgstatCollectorMain+0x198
[c:\pginstaller-repo\postgres.windows\src\backend\postmaster\pgstat.c @
2762]

00cbff28 00505157 postgres!SubPostmasterMain+0x352
[c:\pginstaller-repo\postgres.windows\src\backend\postmaster\postmaster.c @
4012]

00cbff44 006bb8dd postgres!main+0x177
[c:\pginstaller-repo\postgres.windows\src\backend\main\main.c @ 165]

00cbff88 76743677 postgres!__tmainCRTStartup+0x10f
[f:\sp\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 597]

00cbff94 77369f02 kernel32!BaseThreadInitThunk+0xe

00cbffd4 77369ed5 ntdll!__RtlUserThreadStart+0x70

00cbffec 00000000 ntdll!_RtlUserThreadStart+0x1b

1 Id: 2a60.2aec Suspend: 1 Teb: 7efda000 Unfrozen

ChildEBP RetAddr

01d0fea4 75057a15 ntdll!NtFsControlFile+0x15

01d0fee8 0058a6e7 KERNELBASE!ConnectNamedPipe+0x5d

01d0ff88 76743677 postgres!pg_signal_thread+0x97
[c:\pginstaller-repo\postgres.windows\src\backend\port\win32\signal.c @
275]

01d0ff94 77369f02 kernel32!BaseThreadInitThunk+0xe

01d0ffd4 77369ed5 ntdll!__RtlUserThreadStart+0x70

01d0ffec 00000000 ntdll!_RtlUserThreadStart+0x1b

# 2 Id: 2a60.28d4 Suspend: 1 Teb: 7efd7000 Unfrozen

ChildEBP RetAddr

0210ff58 773cfabe ntdll!DbgBreakPoint

0210ff88 76743677 ntdll!DbgUiRemoteBreakin+0x3c

0210ff94 77369f02 kernel32!BaseThreadInitThunk+0xe

0210ffd4 77369ed5 ntdll!__RtlUserThreadStart+0x70

0210ffec 00000000 ntdll!_RtlUserThreadStart+0x1b
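
In all three samples, thread 0 shows the same frames: PgstatCollectorMain
calling pgwin32_recv, which waits in pgwin32_waitforsinglesocket. Checking
that by eye is error-prone, so here is a minimal sketch of how saved "~*k"
output could be compared. The helper names are hypothetical, and the parsing
assumes the 32-bit "ChildEBP RetAddr module!symbol+offset" layout shown above.

```python
import re

# Thread header, e.g. "0 Id: 2a60.2404 ..." or "# 2 Id: 2a60.25b8 ..."
THREAD = re.compile(r"^[.#\s]*(\d+)\s+Id:\s")
# Frame line: two 8-digit hex columns, then module!symbol+offset
FRAME = re.compile(r"^[0-9a-fA-F]{8} [0-9a-fA-F]{8} (\S+!\S+)")


def frames_by_thread(dump):
    """Map each thread number to its list of module!symbol+offset frames."""
    threads, current = {}, None
    for line in dump.splitlines():
        line = line.strip()
        t = THREAD.match(line)
        if t:
            current = int(t.group(1))
            threads[current] = []
            continue
        f = FRAME.match(line)
        if f and current is not None:
            threads[current].append(f.group(1))
    return threads


def same_stack(dumps, thread=0):
    """True if the given thread has identical frames in every dump."""
    stacks = [frames_by_thread(d).get(thread) for d in dumps]
    return all(s == stacks[0] for s in stacks)
```

Identical frames across samples only suggest that the thread is parked in
the same wait each time; they do not show on their own whether it made
progress between samples.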

This was a production server, so we had to get it going. We restarted
Postgres, stopped all other activity on the server, and tried vacuuming
by hand. A manual VACUUM of a 4GB table had not finished after 30
minutes, so we cancelled it and truncated the big tables instead (the data is
highly transient and we got it back within several hours). We did manage to
VACUUM the smaller tables (from 400MB to about 200MB). After this, we
restarted all services using the database.

It has been 16 hours now and the "pgstat wait timeout" messages have not
returned. The pg_stat_user_tables view shows autovacuum working normally on
all tables.

Any idea why the stats collector was stuck?

Please let me know if you need any more information.

Regards,
Radu Ilie
