Re: pgsql/src backend/tcop/postgres.c include/misc ...

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql/src backend/tcop/postgres.c include/misc ...
Date: 2002-01-06 18:15:54
Message-ID: 26178.1010340954@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

"Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp> writes:
>> Doesn't bother me a whole lot; I don't think that's what the die
>> interrupt is for. In my mind the main reason die() exists is to
>> behave reasonably when the system is being shut down and init has
>> sent SIGTERM to all processes.

> In my mind the main reason die() exists is to kill individual
> backends which seems to be in trouble without causing
> the database-wide restart.

[ raises eyebrow ] That isn't recommended procedure or even documented
anywhere, AFAIR. But if you are using SIGTERM for that purpose then you
should certainly want some confidence that the killed backend hasn't
left things in a bad state.

> Before 7.1 QueryCancel flag was checked at the points
> CHECK_FOR_INTERRUPTS are currently placed.
> But the QueryCancel flag had nothing to do with die
> interrupts.

Indeed, and before 7.1 killing a backend with SIGTERM at a random time
was horribly dangerous. I did a bunch of retail patching at one point:

2001-01-12 16:53 tgl

* src/: backend/access/heap/heapam.c,
backend/access/nbtree/nbtinsert.c, backend/access/nbtree/nbtpage.c,
backend/access/transam/xact.c, backend/access/transam/xlog.c,
backend/commands/sequence.c, backend/commands/vacuum.c,
backend/storage/buffer/bufmgr.c, backend/storage/file/fd.c,
backend/storage/ipc/spin.c, backend/storage/lmgr/proc.c,
backend/tcop/postgres.c, backend/utils/cache/temprel.c,
backend/utils/init/postinit.c, backend/utils/mmgr/aset.c,
include/access/xlog.h, include/utils/elog.h: Add more
critical-section calls: all code sections that hold spinlocks are
now critical sections, so as to ensure die() won't interrupt us
while we are munging shared-memory data structures. Avoid insecure
intermediate states in some code that proc_exit will call, like
palloc/pfree. Rename START/END_CRIT_CODE to
START/END_CRIT_SECTION, since that seems to be what people tend to
call them anyway, and make them be called with () like a function
call, in hopes of not confusing pg_indent. I doubt that this is
sufficient to make SIGTERM safe anywhere; there's just too much
code that could get invoked during proc_exit().

and then gave up and proposed the current scheme. See

http://archives.postgresql.org/pgsql-hackers/2001-01/msg00147.php
http://fts.postgresql.org/db/mw/msg.html?mid=1277517
http://fts.postgresql.org/db/mw/msg.html?mid=1277531

The notion of checking at every spinlock boundary seems to have been
tossed out without much thought in
http://fts.postgresql.org/db/mw/msg.html?mid=1277532
As the instigator of that approach I feel entitled to decide that
it wasn't such a good idea after all ;-).

I put a bunch of CHECK_FOR_INTERRUPTS() into various outer loops
yesterday (vacuum, create index, sort --- btw, there are no spinlocks
grabbed or released while sorting, so the old approach didn't help
there anyway). I'm happy to throw in more as we discover we need 'em.
But I don't think it's a good idea to have them occur implicitly in
operations as low-level as grabbing or releasing spinlocks.

regards, tom lane

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message tgl 2002-01-06 21:40:08 pgsql/src/backend/postmaster postmaster.c
Previous Message momjian 2002-01-06 18:12:00 pgsql/doc/src/sgml/ref ecpg-ref.sgml