autovacuum crash due to null pointer

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: autovacuum crash due to null pointer
Date: 2008-07-16 14:54:38
Message-ID: 8207.1216220078@sss.pgh.pa.us
Lists: pgsql-hackers

There's a fairly interesting crash here:
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=jaguar&dt=2008-07-16%2003:00:02
The buildfarm was nice enough to provide a stack trace at the bottom of
the page, which shows clearly that autovac tried to pfree a null
pointer.

What I think happened was that the table that was selected to be
autovacuumed got dropped during the setup steps, leading get_rel_name()
to return NULL at line 2167. vacuum() itself would have fallen out
silently ... however, had it errored out, the attempts at error
reporting in the PG_CATCH block would have crashed.
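
The failure mode described above can be sketched in standalone C. This is only an illustration, not the actual autovacuum code: lookup_name(), strict_free(), and process_table() are hypothetical stand-ins, with strict_free() mimicking a deallocator that asserts on a null pointer, as the mcxt.c assertion in the log suggests pfree does. The assumed fix is simply to test the lookup result before using or freeing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a catalog lookup such as get_rel_name():
 * returns a freshly allocated copy of the name, or NULL if the relation
 * no longer exists (for example, it was dropped concurrently).
 */
static char *
lookup_name(const char *name, bool still_exists)
{
    char *copy;

    if (!still_exists)
        return NULL;
    copy = malloc(strlen(name) + 1);
    if (copy != NULL)
        strcpy(copy, name);
    return copy;
}

/*
 * Hypothetical strict deallocator: unlike free(), it refuses a null
 * pointer, which is the behavior the assertion failure points at.
 */
static void
strict_free(void *ptr)
{
    assert(ptr != NULL);
    free(ptr);
}

/*
 * Guarded caller: returns true if the table was processed, false if it
 * was skipped because the lookup came back NULL (table dropped, or
 * allocation failure in this sketch).
 */
static bool
process_table(const char *name, bool still_exists)
{
    char *relname = lookup_name(name, still_exists);

    if (relname == NULL)
        return false;           /* nothing to report on, nothing to free */

    printf("processing \"%s\"\n", relname);
    strict_free(relname);       /* safe: relname is known to be non-NULL */
    return true;
}
```

Calling process_table("some_table", false) models the table being dropped during setup: the function returns false without ever reaching strict_free(), where an unguarded version would trip the assertion.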

I see that we already noticed and fixed this type of problem in
autovac_report_activity(), but do_autovacuum() didn't get the word.
Is there anyplace else in there with the same issue? For that matter,
why is autovac_report_activity() repeating the lookups already done
at the outer level?

One other point is that the postmaster log just says

TRAP: FailedAssertion("!(pointer != ((void *)0))", File: "mcxt.c", Line: 580)
[487d6715.3a87:2] LOG: server process (PID 16885) was terminated by signal 6: Aborted

Could we get that to say "autovacuum worker" instead of "server"?

regards, tom lane
