Re: bgwriter never dies

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Philip Warner <pjw(at)rhyme(dot)com(dot)au>
Cc: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, Neil Conway <neilc(at)samurai(dot)com>, Jan Wieck <JanWieck(at)Yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: bgwriter never dies
Date: 2004-02-26 05:01:41
Message-ID: 9142.1077771701@sss.pgh.pa.us
Lists: pgsql-hackers

Philip Warner <pjw(at)rhyme(dot)com(dot)au> writes:
> I'm not even sure I'd go with the rollback; whatever killed the PM may
> make the rest of the system unstable. I'd prefer to see the transactions
> rolled back (if necessary) as part of the log recovery on PM startup, not
> by possibly dying PG processes.

Well, in the first place "rollback" is not an explicit action in
Postgres; you're thinking of Oracle or some other old-line technology.
There's nothing that has to happen to undo the effects of a failed
transaction.

But my real problem with the above line of reasoning is that there is
no basis for assuming that a postmaster failure has anything to do with
problems at the backend level. We have always gone out of our way to
ensure that the postmaster is disconnected from backend failure causes
--- it doesn't touch any but the simplest shared-memory data structures,
for example. This design rule exists mostly to try to ensure that the
postmaster will survive backend crashes, but the effects cut both ways:
there is no reason that a backend won't survive a postmaster crash.
In practice, the few postmaster crashes I've seen have been due to
localized bugs in postmaster-only code or a Linux kernel randomly
seizing on the postmaster as the victim for an out-of-memory kill.
I have never seen the postmaster crash as a result of backend-level
problems, and if I did I'd be out to fix it immediately.

So my opinion is that "kill all the backends when the postmaster
crashes" is a bad idea that will only result in a net reduction in
system reliability. There is no point in building insulated independent
components if you then put in logic to force the system uptime to be the
minimum of the component uptimes.

regards, tom lane
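
The closing argument above (that killing backends on a postmaster crash forces uptime down to the minimum of the component uptimes) can be illustrated with a toy model. This is only a sketch under stated assumptions, not PostgreSQL code: the function names and the failure times are hypothetical, and the model covers only sessions that are already established, since accepting new connections still requires a running postmaster.

```python
# Toy model, for illustration only (not PostgreSQL code).
# Assumption: each component has a time at which it would fail on its own.


def coupled_session_lifetime(postmaster_fails_at, backend_fails_at):
    """Policy: kill all backends when the postmaster dies.

    An established session ends at whichever failure comes first.
    """
    return min(postmaster_fails_at, backend_fails_at)


def insulated_session_lifetime(postmaster_fails_at, backend_fails_at):
    """Policy: backends keep running after a postmaster crash.

    An established session ends only when its own backend fails;
    the postmaster's failure time is deliberately ignored.
    """
    return backend_fails_at


if __name__ == "__main__":
    # Hypothetical failure times, in hours.
    postmaster_fails_at = 100.0
    backend_fails_at = 250.0

    print(coupled_session_lifetime(postmaster_fails_at, backend_fails_at))
    print(insulated_session_lifetime(postmaster_fails_at, backend_fails_at))
```

Under this model the insulated policy is never worse for an existing session, and it is strictly better whenever the postmaster would fail before the backend does.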
