Re: Reasoning behind process instead of thread based

From: Thomas Hallgren <thhal(at)mailblocks(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: nd02tsk(at)student(dot)hig(dot)se, pgsql-general(at)postgresql(dot)org
Subject: Re: Reasoning behind process instead of thread based
Date: 2004-10-27 22:48:51
Message-ID: 418025D3.5090205@mailblocks.com
Lists: pgsql-general
Tom Lane wrote:
> Right.  Depending on your OS you may be able to catch a signal that
> would kill a thread and keep it from killing the whole process, but
> this still leaves you with a process memory space that may or may not
> be corrupted.  Continuing in that situation is not cool, at least not
> according to the Postgres project's notions of reliable software design.
> 
There can't be any "may or may not" involved. You must of course know 
what went wrong.

The most common failures are a null pointer access (an attempt to
access address zero), a stack overflow (the stack hits a
write-protected page), or some sort of arithmetic exception. These
conditions can be trapped and handled gracefully. The signal handler
must be able to determine the cause of the exception, which usually
involves unwinding the stack and investigating the state of the CPU at
the point where the signal was generated. If the cause is not a
recognized one, the process must be terminated.
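
As a rough illustration of the "check the cause" step, here is a
minimal POSIX sketch. It is not code from any existing server. The
names classify_fault, fault_handler, install_fault_handlers,
recovery_point and recovery_armed are hypothetical, and the rule that
a faulting address below 4096 counts as a null pointer access is an
assumption. Stack overflow detection is left out, since it would also
need an alternate signal stack (sigaltstack) and knowledge of where
the guard page is.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical classification of the fault causes named in the text. */
typedef enum { FAULT_UNKNOWN, FAULT_NULL_ACCESS, FAULT_ARITHMETIC } fault_kind;

/* Per-thread recovery point. A thread would fill it with
   sigsetjmp(recovery_point, 1) and set recovery_armed before running
   code that may fault. */
_Thread_local sigjmp_buf recovery_point;
_Thread_local volatile sig_atomic_t recovery_armed = 0;

/* Decide whether a fault is one of the recognized causes.
   Assumption: an unmapped address in the first 4096 bytes is treated
   as a null pointer access. */
fault_kind classify_fault(int signo, int code, const void *addr)
{
    if (signo == SIGFPE && (code == FPE_INTDIV || code == FPE_INTOVF))
        return FAULT_ARITHMETIC;
    if (signo == SIGSEGV && code == SEGV_MAPERR && (uintptr_t)addr < 4096)
        return FAULT_NULL_ACCESS;
    return FAULT_UNKNOWN;
}

void fault_handler(int signo, siginfo_t *info, void *context)
{
    (void)context; /* the saved CPU state would be inspected here */
    fault_kind kind = classify_fault(signo, info->si_code, info->si_addr);
    if (kind != FAULT_UNKNOWN && recovery_armed) {
        recovery_armed = 0;
        siglongjmp(recovery_point, (int)kind); /* unwind to the thread's recovery point */
    }
    /* Unrecognized cause: restore the default action and re-raise, so
       the process is terminated once the handler returns. */
    signal(signo, SIG_DFL);
    raise(signo);
}

/* Install the handler for SIGSEGV and SIGFPE. Returns 0 on success. */
int install_fault_handlers(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = fault_handler;
    sa.sa_flags = SA_SIGINFO; /* request siginfo_t with si_code and si_addr */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGFPE, &sa, NULL);
}
```

Whether it is safe to continue after such a jump still depends on
what the interrupted thread was doing at the time, which is the point
under discussion in this thread.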

Out-of-memory conditions can be managed using thread-local allocation
areas (similar to MemoryContext) and killing a thread, chosen by some
criterion, when no more memory is available. The victim could be the
thread that encountered the problem, the thread that consumes the most
memory, the thread that was least recently active, or something else.
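
A minimal sketch of that idea follows, under stated assumptions. The
types and functions here (thread_area, area_alloc, area_release,
choose_victim) are hypothetical and are not PostgreSQL's MemoryContext
API. Each thread charges its allocations to its own area, so the area
of a killed thread can be released in one step, and the three policies
mirror the criteria listed above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every allocation. The union pads it so
   that the user memory behind it stays suitably aligned. */
typedef union chunk {
    union chunk *next;
    max_align_t align;
} chunk;

/* Hypothetical per-thread allocation area. */
typedef struct {
    int id;
    size_t bytes_in_use;
    unsigned long last_active; /* logical clock, larger means more recent */
    chunk *chunks;             /* every chunk owned by this thread */
} thread_area;

typedef enum { VICTIM_REQUESTER, VICTIM_LARGEST, VICTIM_LEAST_RECENT } victim_policy;

/* Allocate from an area. A NULL result is the point at which a victim
   thread would be chosen. */
void *area_alloc(thread_area *area, size_t size)
{
    chunk *c = malloc(sizeof(chunk) + size);
    if (c == NULL)
        return NULL;
    c->next = area->chunks;
    area->chunks = c;
    area->bytes_in_use += size;
    return c + 1; /* user memory starts right after the header */
}

/* Free everything a thread allocated, for example after killing it. */
void area_release(thread_area *area)
{
    while (area->chunks != NULL) {
        chunk *next = area->chunks->next;
        free(area->chunks);
        area->chunks = next;
    }
    area->bytes_in_use = 0;
}

/* Return the index of the thread to kill under the given policy. */
int choose_victim(const thread_area *areas, int n, int requester,
                  victim_policy policy)
{
    int victim = requester;
    if (policy == VICTIM_REQUESTER)
        return requester;
    for (int i = 0; i < n; i++) {
        if (policy == VICTIM_LARGEST &&
            areas[i].bytes_in_use > areas[victim].bytes_in_use)
            victim = i;
        if (policy == VICTIM_LEAST_RECENT &&
            areas[i].last_active < areas[victim].last_active)
            victim = i;
    }
    return victim;
}
```

A real server would also need a safe way to stop the victim thread
before calling area_release on its area. That part is not shown here.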

> It should be pointed out that when we get a hard backend crash, Postgres
> will forcibly terminate all the backends and reinitialize; which means
> that in terms of letting concurrent sessions keep going, we are not any
> more forgiving than a single-address-space multithreaded server.  The
> real bottom line here is that we have good prospects of confining the
> damage done by the failed process: it's unlikely that anything bad will
> happen to already-committed data on disk or that any other sessions will
> return wrong answers to their clients before we are able to kill them.
> It'd be a lot harder to say that with any assurance for a multithreaded
> server.
> 
I'm not sure I follow. You will be able to bring all threads of one 
process to a halt much faster than you can kill a number of external 
processes. Killing the multithreaded process is more like pulling the plug.

Regards,
Thomas Hallgren
