Re: Race condition in backend process exit

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Race condition in backend process exit
Date: 2005-08-07 23:37:56
Message-ID: 4504.1123457876@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
>> I could not provoke the same crash in 8.0, but I suspect this may just
>> be a chance timing difference, and that the bug may be of long standing.

> I haven't done the experiment, but I'm pretty certain that it's possible
> to provoke this same crash in 8.0 if the timing is right, which could be
> forced by using gdb to delay execution at the right place in ProcKill.

Having done the experiment, I can now say that 8.0 and prior versions
are *not* vulnerable, but the reason is, um, subtle. The actual
execution order of on_shmem_exit callbacks in an exiting backend is:

    ShutdownPostgres
    CleanupInvalidationState
    ProcKill

CleanupInvalidationState removes the backend from the SI invalidation
message ring. Until I recently refactored the code to separate the
PGPROC array from the SI mechanism, that had the side effect of making
the backend's PGPROC disappear from the set visible to
TransactionIdIsInProgress. Which means that in fact the released
versions do honor the rule "stop being in-progress before you release
locks".
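
To make the ordering argument concrete, here is a toy model in C. It is a sketch, not PostgreSQL source: every toy_* name is hypothetical, the callback registry is assumed to run in reverse order of registration, and placing the lock release inside ProcKill is an assumption of the model. The flag si_owns_proc_set stands for the pre-refactoring layout, in which leaving the SI ring also hides the PGPROC from TransactionIdIsInProgress.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these toy_* names exist in PostgreSQL. */

#define TOY_MAX_CALLBACKS 8

typedef enum ToyStep
{
    TOY_SHUTDOWN_POSTGRES,
    TOY_CLEANUP_INVALIDATION_STATE,
    TOY_PROC_KILL
} ToyStep;

typedef struct ToyBackend
{
    bool    visible_in_progress;  /* would TransactionIdIsInProgress see us? */
    bool    holds_locks;          /* are our locks still held? */
    bool    rule_violated;        /* locks released while still in-progress */
    bool    si_owns_proc_set;     /* true models the pre-refactoring layout */
    ToyStep trace[TOY_MAX_CALLBACKS];  /* callbacks in execution order */
    int     ntrace;
} ToyBackend;

typedef void (*toy_callback) (ToyBackend *b);

static toy_callback toy_callbacks[TOY_MAX_CALLBACKS];
static int  toy_ncallbacks = 0;

/* Register a callback; assumed to run in reverse order of registration. */
static void
toy_on_shmem_exit(toy_callback fn)
{
    assert(toy_ncallbacks < TOY_MAX_CALLBACKS);
    toy_callbacks[toy_ncallbacks++] = fn;
}

/* Run and discard all registered callbacks, last registered first. */
static void
toy_shmem_exit(ToyBackend *b)
{
    while (toy_ncallbacks > 0)
        toy_callbacks[--toy_ncallbacks] (b);
}

static void
toy_note(ToyBackend *b, ToyStep step)
{
    assert(b->ntrace < TOY_MAX_CALLBACKS);
    b->trace[b->ntrace++] = step;
}

/* Shortcut abort: in this model it does not clear the in-progress state. */
static void
toy_ShutdownPostgres(ToyBackend *b)
{
    toy_note(b, TOY_SHUTDOWN_POSTGRES);
}

/* Leaves the SI ring; hides the backend only in the old layout. */
static void
toy_CleanupInvalidationState(ToyBackend *b)
{
    toy_note(b, TOY_CLEANUP_INVALIDATION_STATE);
    if (b->si_owns_proc_set)
        b->visible_in_progress = false;
}

/* Releases locks (a model assumption), then drops the backend entirely. */
static void
toy_ProcKill(ToyBackend *b)
{
    toy_note(b, TOY_PROC_KILL);
    if (b->visible_in_progress)
        b->rule_violated = true;    /* released locks while in-progress */
    b->holds_locks = false;
    b->visible_in_progress = false;
}

/* Simulate one backend exit and return the final state. */
ToyBackend
toy_run_exit(bool si_owns_proc_set)
{
    ToyBackend  b = {0};

    b.visible_in_progress = true;
    b.holds_locks = true;
    b.si_owns_proc_set = si_owns_proc_set;

    /* Registered in the reverse of the execution order listed above. */
    toy_on_shmem_exit(toy_ProcKill);
    toy_on_shmem_exit(toy_CleanupInvalidationState);
    toy_on_shmem_exit(toy_ShutdownPostgres);
    toy_shmem_exit(&b);
    return b;
}
```

In this model toy_run_exit(true) leaves rule_violated false, because the backend stops being visible before ProcKill runs, while toy_run_exit(false) sets it. That is only an illustration of why the side effect mattered, not a reproduction of the actual crash.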

This behavior is obviously mighty fragile, not to say undocumented,
so I'm still strongly inclined to make ShutdownPostgres do a normal
transaction abort sequence. But I'm no longer very excited about
back-patching it.

> This bug may well explain the known reports of failures from SIGTERM'ing
> an individual backend, since (IIRC) that code path could also try to
> exit the backend with a transaction still in progress.

The particular issue exhibited here evidently isn't the explanation
for SIGTERM problems in existing releases ... but I still suspect that
those reports might have something to do with ShutdownPostgres taking
shortcuts with transaction abort.

regards, tom lane
