Re: autovacuum crash due to null pointer

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: autovacuum crash due to null pointer
Date: 2008-07-17 14:15:16
Message-ID: 20080717141516.GA3934@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane wrote:
> There's a fairly interesting crash here:
> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=jaguar&dt=2008-07-16%2003:00:02
> The buildfarm was nice enough to provide a stack trace at the bottom of
> the page, which shows clearly that autovac tried to pfree a null
> pointer.
>
> What I think happened was that the table that was selected to be
> autovacuumed got dropped during the setup steps, leading get_rel_name()
> to return NULL at line 2167. vacuum() itself would have fallen out
> silently ... however, had it errored out, the attempts at error
> reporting in the PG_CATCH block would have crashed.

Ouch. This was introduced when we added the PG_TRY block to add the
errcontext and resume vacuuming with the next table; the original code
was clean about this problem.

> I see that we already noticed and fixed this type of problem in
> autovac_report_activity(), but do_autovacuum() didn't get the word.
> Is there anyplace else in there with the same issue?

I'll have a more detailed look.

> For that matter, why is autovac_report_activity repeating the lookups
> already done at the outer level?

Ah, good catch -- it's because the outer code was added later. It
should be easy to pass the names down, I think.

> One other point is that the postmaster log just says
>
> TRAP: FailedAssertion("!(pointer != ((void *)0))", File: "mcxt.c", Line: 580)
> [487d6715.3a87:2] LOG: server process (PID 16885) was terminated by signal 6: Aborted
>
> Could we get that to say "autovacuum worker" instead of "server"?

Hmm, that would require moving code from HandleChildCrash to
CleanupBackend, I think. (We're creating the process name in
CleanupBackend without scanning the BackendList, which we would need to
figure out if it's a worker.)

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jonah H. Harris 2008-07-17 14:34:04 Re: [PATCH]-hash index improving
Previous Message Pavel Stehule 2008-07-17 12:11:44 Re: hash distinct