Re: Archiver not exiting upon crash

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Archiver not exiting upon crash
Date: 2012-05-23 21:21:37
Message-ID: 12512.1337808097@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
>> But what happens if the SIGQUIT is blocked before the system(3) is
>> invoked? Does the ignore take precedence over the block, or does the
>> block take precedence over the ignore, and so the signal is still
>> waiting once the block is reversed after the system(3) is over? I
>> could write a test program to see, but that wouldn't be very good
>> evidence of the portability.

> AFAICT from the POSIX spec for system(3), that would be a bug in
> system().

Actually, on further thought, it seems like there is *necessarily* a
race condition in this. There must be some interval where the child
process has already exited but the waiting parent hasn't de-ignored the
signals. So if SIGQUIT is delivered just then, it must go into the
aether. This, together with the thought that the child process might
accidentally or intentionally ignore the signal, makes me think that
maybe you're right and we need to retransmit the SIGQUIT occasionally.
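
To make the lost-signal window concrete, here is a minimal sketch in C. It
assumes a POSIX-conforming system(3) (one that sets SIGQUIT to SIG_IGN in the
caller while it waits for the child) and a shell that expands $PPID. The names
on_sigquit and sigquit_lost_during_system are hypothetical, chosen only for
this illustration; nothing here is taken from the archiver code. It shows the
broad case (a SIGQUIT arriving at any point while the parent is inside
system()), of which the interval described above is the tail end.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Set by the handler if a SIGQUIT actually reaches this process. */
static volatile sig_atomic_t got_sigquit = 0;

static void
on_sigquit(int signo)
{
    (void) signo;
    got_sigquit = 1;
}

/*
 * Hypothetical helper: returns 1 if a SIGQUIT sent to this process while
 * it was inside system() was lost, 0 if the handler saw it, and -1 if the
 * setup or the system() call itself failed.
 */
int
sigquit_lost_during_system(void)
{
    struct sigaction sa;

    /* Install a real handler, so SIGQUIT is neither ignored nor fatal. */
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigquit;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGQUIT, &sa, NULL) != 0)
        return -1;

    got_sigquit = 0;

    /*
     * The child shell sends SIGQUIT to its parent, i.e. to this process,
     * which system() has switched to ignoring SIGQUIT for the duration of
     * the call. An ignored signal is discarded, not queued.
     */
    if (system("kill -QUIT $PPID") == -1)
        return -1;

    /* system() has restored our handler, but the signal is already gone. */
    return got_sigquit ? 0 : 1;
}
```

On a system behaving as assumed, the helper returns 1: the handler is back in
place once system() returns, yet it never fires, because the signal was
discarded rather than left pending. A sender that needs the signal to take
effect therefore has no option but to send it again later.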

However, I remain unsatisfied with this idea as an explanation for the
behavior you're seeing. In the first place, that race condition window
ought not be wide enough to allow failure probabilities as high as 10%.
In the second place, that code has been like that for a long while,
so this theory absolutely does not explain why you're seeing a
materially higher probability of failure in HEAD than 9.1. There is
something else going on.

regards, tom lane
