Re: Help: 8.0.3 Vacuum of an empty table never completes ...

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: James Robinson <jlrobins(at)socialserve(dot)com>
Cc: Hackers Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Help: 8.0.3 Vacuum of an empty table never completes ...
Date: 2005-11-28 17:00:04
Message-ID: 11653.1133197204@sss.pgh.pa.us
Lists: pgsql-hackers

James Robinson <jlrobins(at)socialserve(dot)com> writes:
> On Nov 28, 2005, at 11:38 AM, Tom Lane wrote:
>> Can you get a similar backtrace from the vacuumdb process?

> OK:

> (gdb) bt
> #0 0xffffe410 in ?? ()
> #1 0xbfffe4f8 in ?? ()
> #2 0x00000030 in ?? ()
> #3 0x08057b68 in ?? ()
> #4 0xb7e98533 in __write_nocancel () from /lib/tls/libc.so.6
> #5 0xb7e4aae6 in _IO_new_file_write () from /lib/tls/libc.so.6
> #6 0xb7e4a7e5 in new_do_write () from /lib/tls/libc.so.6
> #7 0xb7e4aa63 in _IO_new_file_xsputn () from /lib/tls/libc.so.6
> #8 0xb7e413a2 in fputs () from /lib/tls/libc.so.6
> #9 0xb7fd8f99 in defaultNoticeProcessor () from /usr/local/pgsql/lib/libpq.so.4
> #10 0xb7fd8fe5 in defaultNoticeReceiver () from /usr/local/pgsql/lib/libpq.so.4
> #11 0xb7fe2d34 in pqGetErrorNotice3 () from /usr/local/pgsql/lib/libpq.so.4
> #12 0xb7fe3921 in pqParseInput3 () from /usr/local/pgsql/lib/libpq.so.4
> #13 0xb7fdb174 in parseInput () from /usr/local/pgsql/lib/libpq.so.4
> #14 0xb7fdca99 in PQgetResult () from /usr/local/pgsql/lib/libpq.so.4
> #15 0xb7fdcc4b in PQexecFinish () from /usr/local/pgsql/lib/libpq.so.4
> #16 0x0804942c in vacuum_one_database ()
> #17 0x080497a1 in main ()

OK, so evidently the backend is sending NOTICE messages, and the
vacuumdb is blocked trying to copy those messages to stderr.
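
To illustrate that failure mode in isolation (a sketch only, not vacuumdb's
or libpq's actual code): a writer whose reader has stopped draining a pipe
eventually cannot complete write(). The Python below puts the write end in
non-blocking mode so the stall shows up as BlockingIOError rather than a
hang like the one in the backtrace. The function name and chunk size are
made up for the example.

```python
import os


def fill_pipe_until_stalled(chunk_size=4096):
    """Write to a pipe nobody is reading, then drain it and write again.

    Returns (bytes_accepted_before_stall, write_succeeded_after_drain).
    Hypothetical helper for illustration only.
    """
    read_fd, write_fd = os.pipe()
    # Non-blocking so a full pipe raises instead of sleeping forever,
    # which is what a blocking writer such as fputs() to stderr would do.
    os.set_blocking(write_fd, False)
    chunk = b"x" * chunk_size
    written = 0
    try:
        while True:
            try:
                written += os.write(write_fd, chunk)
            except BlockingIOError:
                # The kernel buffer is full: the reader is not keeping up.
                break

        # Play the role of the stalled reader waking up and draining.
        drained = 0
        while drained < written:
            drained += len(os.read(read_fd, 65536))

        # With the buffer empty again, the writer can make progress.
        resumed = os.write(write_fd, chunk) > 0
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return written, resumed


if __name__ == "__main__":
    accepted, resumed = fill_pipe_until_stalled()
    print(f"pipe accepted {accepted} bytes before stalling; "
          f"write after drain succeeded: {resumed}")
```

The number of bytes accepted before the stall depends on the platform's
pipe buffer size, so no particular value should be expected.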

> Things to know which could possibly be of use. This cron is kicked
> off on the backup database box, and the vacuumdb is run via ssh to
> the primary box. The primary box is running the vacuumdb operation
> with --analyze --verbose, with the output being streamed to a logfile
> on the backup box. Lemme guess __write_nocancel calls syscall write,
> and 0x00000030 might could well be the syscall entry point? Something
> gumming up the networking or sshd itself could have stopped up the
> output queues, and the backups populated all the way down to this level?

That's what it looks like: the output queue from the vacuumdb has
stopped up somehow. Your next move is to look at the state of sshd
and whatever is running at the client end of the ssh tunnel.

regards, tom lane
