Re: elog(FATAL) vs shared memory

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Martin Pitt <martin(dot)pitt(at)ubuntu(dot)com>, Mark Shuttleworth <mark(at)ubuntu(dot)com>
Subject: Re: elog(FATAL) vs shared memory
Date: 2007-04-27 00:34:23
Message-ID: 200704270034.l3R0YNG12445@momjian.us
Lists: pgsql-hackers

Where are we on this?

---------------------------------------------------------------------------

Tom Lane wrote:
> In this thread:
> http://archives.postgresql.org/pgsql-bugs/2007-03/msg00145.php
> we eventually determined that the reported lockup had three components:
> 
> (1) something (still not sure what --- Martin and Mark, I'd really like
> to know) was issuing random SIGTERMs to various postgres processes
> including autovacuum.
> 
> (2) if a SIGTERM happens to arrive while btbulkdelete is running,
> the next CHECK_FOR_INTERRUPTS will do elog(FATAL), causing elog.c
> to do proc_exit(0), leaving the vacuum still recorded as active in
> the shared memory array maintained by _bt_start_vacuum/_bt_end_vacuum.
> The PG_TRY block in btbulkdelete doesn't get a chance to clean up.
> 
> (3) eventually, either we try to re-vacuum the same index or
> accumulation of bogus active entries overflows the array.
> Either way, _bt_start_vacuum throws an error, which btbulkdelete
> PG_CATCHes, leading to _bt_end_vacuum trying to re-acquire the LWLock
> already taken by _bt_start_vacuum, meaning that the process hangs up.
> And then so does anything else that needs to take that LWLock...
> 
> Point (3) is already fixed in CVS, but point (2) is a lot nastier.
> What it essentially says is that trying to clean up shared-memory
> state in a PG_TRY block is unsafe: you can't be certain you'll
> get to do it.  Now this is not a big deal during normal SIGTERM or
> SIGQUIT database shutdown, because we're going to abandon the shared
> memory segment anyway.  However, if we ever want to support individual
> session kill via SIGTERM, it's a problem.  Even if we were not
> interested in someday considering that a supported feature, it seems
> that dealing with random SIGTERMs is needed for robustness in at least
> some environments.
> 
> AFAICS, there are basically two ways we might try to approach this:
> 
> Plan A: establish the rule that you mustn't try to clean up shared
> memory state in a PG_CATCH block.  Anything you need to do like that
> has to be handled by an on_shmem_exit hook function, so it will be
> called during a FATAL exit.  (Or maybe you can do it in PG_CATCH for
> normal ERROR cases, but you need a backing on_shmem_exit hook to
> clean up for FATAL.)
> 
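To make the Plan A idea concrete, here is a minimal standalone sketch. It is not PostgreSQL code: toy_on_exit, toy_proc_exit, and the toy_*_vacuum helpers are hypothetical names that only loosely imitate on_shmem_exit, proc_exit, and _bt_start_vacuum/_bt_end_vacuum, and a longjmp stands in for the process going away. The point it illustrates is that cleanup registered as an exit hook still runs on the FATAL path, while cleanup placed after the interrupted code does not.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Toy stand-in for the shared-memory array of active vacuums. */
#define MAX_VACUUMS 4
static int active_vacuums[MAX_VACUUMS];
static int n_active = 0;

/* Hypothetical exit-hook registry, loosely modelled on on_shmem_exit. */
typedef void (*toy_exit_hook) (int code, void *arg);
#define MAX_HOOKS 8
static struct { toy_exit_hook fn; void *arg; } hooks[MAX_HOOKS];
static int n_hooks = 0;

/* Jump target that simulates the process going away. */
static jmp_buf process_end;

static void
toy_on_exit(toy_exit_hook fn, void *arg)
{
    assert(n_hooks < MAX_HOOKS);
    hooks[n_hooks].fn = fn;
    hooks[n_hooks].arg = arg;
    n_hooks++;
}

/* Run hooks in reverse registration order, then "exit". */
static void
toy_proc_exit(int code)
{
    while (n_hooks > 0)
    {
        n_hooks--;
        hooks[n_hooks].fn(code, hooks[n_hooks].arg);
    }
    longjmp(process_end, 1);
}

static void
toy_start_vacuum(int index_id)
{
    assert(n_active < MAX_VACUUMS);
    active_vacuums[n_active++] = index_id;
}

/* Remove index_id from the array by moving the last entry into its slot. */
static void
toy_end_vacuum(int index_id)
{
    for (int i = 0; i < n_active; i++)
    {
        if (active_vacuums[i] == index_id)
        {
            active_vacuums[i] = active_vacuums[--n_active];
            return;
        }
    }
}

static void
toy_end_vacuum_hook(int code, void *arg)
{
    (void) code;
    toy_end_vacuum(*(int *) arg);
}

/* Returns how many stale entries remain after a simulated FATAL exit. */
int
entries_left_after_fatal(bool use_hook)
{
    static int index_id = 42;

    n_active = 0;
    n_hooks = 0;
    if (setjmp(process_end) == 0)
    {
        toy_start_vacuum(index_id);
        if (use_hook)
            toy_on_exit(toy_end_vacuum_hook, &index_id);
        toy_proc_exit(1);           /* FATAL arrives: no catch block runs */
        toy_end_vacuum(index_id);   /* normal-path cleanup, never reached */
    }
    return n_active;
}
```

Calling entries_left_after_fatal(false) leaves one stale entry behind, while entries_left_after_fatal(true) leaves none, which is the behavior Plan A is after.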
> Plan B: change the handling of FATAL errors so that they are thrown
> like normal errors, and the proc_exit call happens only when we get
> out to the outermost control level in postgres.c.  This would mean
> that PG_CATCH blocks get a chance to clean up before the FATAL exit
> happens.  The problem with that is that a non-cooperative PG_CATCH
> block might think it could "recover" from the error, and then the exit
> does not happen at all.  We'd need a coding rule that PG_CATCH blocks
> *must* re-throw FATAL errors, which seems at least as ugly as Plan A.
> In particular, all three of the external-interpreter PLs are willing
> to return errors into the external interpreter, and AFAICS we'd be
> entirely at the mercy of the user-written Perl or Python or Tcl code
> whether it re-throws the error or not.
> 
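The fragility of Plan B can be shown with a small standalone sketch, again not PostgreSQL code. toy_throw, toy_pl_call, and fatal_reaches_top_level are hypothetical names, and setjmp/longjmp stands in for PG_TRY/PG_CATCH under the assumption that FATAL is thrown like an ordinary error. Whether the FATAL ever reaches the outermost level depends entirely on the inner catch block choosing to re-throw.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_level { TOY_ERROR, TOY_FATAL };

static jmp_buf *toy_handler = NULL;     /* innermost active catch point */
static enum toy_level toy_pending;      /* level of the error in flight */

static void
toy_throw(enum toy_level level)
{
    assert(toy_handler != NULL);
    toy_pending = level;
    longjmp(*toy_handler, 1);
}

/* Stand-in for a PL handler with its own catch block. */
static void
toy_pl_call(bool cooperative)
{
    jmp_buf     local;
    jmp_buf    *saved = toy_handler;

    if (setjmp(local) == 0)
    {
        toy_handler = &local;
        toy_throw(TOY_FATAL);       /* e.g. SIGTERM seen at an interrupt check */
    }
    else
    {
        /* catch block: restore the outer handler first */
        toy_handler = saved;
        if (cooperative && toy_pending == TOY_FATAL)
            toy_throw(TOY_FATAL);   /* obeys the "must re-throw" rule */
        /* non-cooperative code "recovers" and swallows the FATAL */
    }
    toy_handler = saved;
}

/* Returns true if the FATAL gets out to the outermost control level. */
bool
fatal_reaches_top_level(bool cooperative)
{
    jmp_buf     outer;
    bool        reached;

    if (setjmp(outer) == 0)
    {
        toy_handler = &outer;
        toy_pl_call(cooperative);
        reached = false;            /* returned normally: exit never happens */
    }
    else
        reached = (toy_pending == TOY_FATAL);   /* proc_exit would go here */

    toy_handler = NULL;
    return reached;
}
```

With cooperative set to true the function returns true; with false it returns false, meaning the session would carry on as if nothing had happened.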
> So Plan B seems unacceptably fragile.  Does anyone see a way to fix it,
> or perhaps a Plan C with a totally different idea?  Plan A seems pretty
> ugly but it's the best I can come up with.
> 
> 			regards, tom lane
> 

-- 
  Bruce Momjian  <bruce(at)momjian(dot)us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
