Re: bgwriter never dies

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "'Jan Wieck'" <JanWieck(at)Yahoo(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: bgwriter never dies
Date: 2004-02-26 21:44:45
Message-ID: 002701c3fcb1$c18d74f0$0200000a@LaptopDellXP
Lists: pgsql-hackers

>Tom Lane
> Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
> > Tom Lane wrote:
> >> I don't think we want that. IMHO the preferred behavior if the
> >> postmaster crashes should be like a "smart shutdown" --- you don't
> >> spawn any more backends (obviously) but existing backends should be
> >> allowed to run until their clients exit. That's how things have
> >> always worked anyway...
>
> > ... In the case of a postmaster crash, I think
> > something in the system is so wrong that I'd prefer an immediate
> > shutdown.
>
> Surely some other people have opinions on this? Hello out there?
>

I would prefer that all backends be allowed to carry on working (how
would you stop them?), but on their current transaction only, and then
quit. But that sounds like each backend would need to check postmaster
status at the end of every transaction - yuk! Is there another way to
get that behaviour?
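
For illustration only, here is a minimal sketch of what such an
end-of-transaction check might look like, assuming the backend has
remembered the postmaster's pid at startup. It uses the POSIX null
signal (kill with signal 0), which sends nothing and only reports
whether the process exists. The name process_is_alive is hypothetical
and is not an existing PostgreSQL function.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a PostgreSQL API): report whether a process
 * with the given pid currently exists.
 *
 * kill(pid, 0) performs the existence and permission checks but sends
 * no signal. Caveat: if the pid has been reused by an unrelated
 * process, this still returns true.
 */
static bool
process_is_alive(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return true;

    /* EPERM: the process exists, but we are not allowed to signal it. */
    return errno == EPERM;
}
```

A backend could call this with the saved postmaster pid after each
commit or abort and exit if it returns false. The cost is one system
call per transaction, which is the overhead the paragraph above
objects to.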

Your comments about the least reliable component bringing the rest down
are appropriate. You should assume that the backends are doing something
very important and should never be interfered with - like a very long
running transaction that is mere seconds away from committing. Besides,
this might encourage some kind of denial of service attack... Overall,
my feeling is that a broken postmaster could be bad, but so could a
malfunctioning "fail safe" feature, so immediate shutdown wouldn't
necessarily get you out of the **** in the way that it seems it might.

If the postmaster crashes, then you might get the situation where you
have one person still connected, yet are unable to connect others. That
would be very annoying with many connected users - admittedly not much
of a problem if you're using external session pooling. You can't restart
the postmaster with one backend still up, can you? I hope not, that
sounds bad: convince me! But if you can't, it's in everybody else's
interests for that last guy to stop cleanly, but earlier than suits
their own convenience, to allow the whole system to be restarted.

Oracle uses a PMON process to monitor for this situation: Oracle's SMON
process is similar to the postmaster/bgwriter. If SMON dies, PMON will
restart it. Should we have a pgmon process that watches the postmaster
and restarts it if required?

Overall, we need a clear statement of how this works, so that things
like the archiver process for PITR know when to stop/start etc. My
suggestion would be to draw out the finite state machine, so there's
never a case where we accidentally turn off archiving while some part of
pg is still up.
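
As a rough sketch of the kind of state machine meant here, and not a
description of how PostgreSQL actually behaves, the following uses
three hypothetical states and three hypothetical events. All names are
invented for illustration. It assumes that a postmaster restart is
refused while old backends remain, and that the caller reports
EV_LAST_BACKEND_EXITED immediately if the postmaster dies with no
backends running.

```c
#include <stdbool.h>

/* Hypothetical cluster states, named for this sketch only. */
typedef enum {
    CLUSTER_RUNNING,   /* postmaster alive */
    CLUSTER_ORPHANED,  /* postmaster gone, backends still working */
    CLUSTER_STOPPED    /* postmaster gone, no backends left */
} ClusterState;

/* Hypothetical events that drive the transitions. */
typedef enum {
    EV_POSTMASTER_DIED,
    EV_LAST_BACKEND_EXITED,
    EV_POSTMASTER_STARTED
} ClusterEvent;

/* Return the state that follows from state s on event e. */
static ClusterState
cluster_next_state(ClusterState s, ClusterEvent e)
{
    switch (s)
    {
        case CLUSTER_RUNNING:
            /* Backends coming and going do not change the state. */
            return (e == EV_POSTMASTER_DIED) ? CLUSTER_ORPHANED : s;
        case CLUSTER_ORPHANED:
            /* Assumed: no restart is possible until backends are gone. */
            return (e == EV_LAST_BACKEND_EXITED) ? CLUSTER_STOPPED : s;
        case CLUSTER_STOPPED:
            return (e == EV_POSTMASTER_STARTED) ? CLUSTER_RUNNING : s;
    }
    return s;
}

/* The archiver keeps running while any part of the system is up. */
static bool
archiver_should_run(ClusterState s)
{
    return s != CLUSTER_STOPPED;
}
```

With the rule written down this way, the orphaned case is explicit:
archiving continues after a postmaster crash until the last backend has
gone.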

Best regards, Simon Riggs
