Re: Race-condition with failed block-write?

From: Arjen van der Meijden <acm(at)tweakers(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: Race-condition with failed block-write?
Date: 2005-09-13 21:41:36
Message-ID: 43274790.9020609@tweakers.net
Lists: pgsql-bugs

On 13-9-2005 23:01, Tom Lane wrote:
> Arjen van der Meijden <acm(at)tweakers(dot)net> writes:
>
> That's all? There's something awfully suspicious about that. You're
> sure this is 8.0.3?

When I do "select version();"

PostgreSQL 8.0.3 on i686-pc-linux-gnu, compiled by GCC
i686-pc-linux-gnu-gcc (GCC) 3.3.5-20050130 (Gentoo Linux
3.3.5.20050130-r1, ssp-3.3.5.20050130-1, pie-8.7.7.1)

I checked Gentoo's emerge.log, and it says I installed version 8.0.3 on
the 15th of August, so well before September 1st.

> AFAICS it is absolutely impossible for the 8.0
> postmaster.c code to emit "received smart shutdown request" after
> emitting "received fast shutdown request". The SIGINT code looks like
>
>
> and the SIGTERM code looks like
>
> and there are no other places that change the value of Shutdown, and
> certainly FastShutdown > SmartShutdown. So I wonder if something got
> lost in the log entries.
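
For readers without the source at hand, the guard being described can be sketched as follows. This is a minimal illustration, not the actual postmaster.c code: the helper `request_shutdown` and the numeric values of the levels are assumptions, chosen only so that FastShutdown > SmartShutdown as stated above.

```c
#include <stddef.h>

/* Shutdown levels, ordered so that a stronger request compares greater.
 * The numeric values are assumed for this sketch. */
enum { NoShutdown = 0, SmartShutdown = 1, FastShutdown = 2 };

/* Current shutdown state; only ever raised, never lowered. */
static int Shutdown = NoShutdown;

/* Hypothetical helper: apply a shutdown request of the given level.
 * Returns the message that would be logged, or NULL when the request
 * is ignored because an equal or stronger shutdown is already pending. */
static const char *request_shutdown(int level)
{
    if (Shutdown >= level)
        return NULL;            /* nothing logged, state unchanged */
    Shutdown = level;
    return level == FastShutdown ? "received fast shutdown request"
                                 : "received smart shutdown request";
}
```

With a guard of this shape, a smart request arriving after a fast one returns NULL and logs nothing, which is why the reported message order looks impossible unless log entries were lost.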

That'd surprise me, but it would explain this behaviour. I doubt though
that much happened in those 11 seconds that are missing. It can't have
been a start-up without logging, since it wouldn't have logged the
shut-down then, would it?
Besides that, I normally start it and shut it down using the
/etc/init.d-scripts. And that script issues a fast-shutdown, so a
smart-shutdown should not be necessary anymore.

> Another question is why the postmaster didn't exit at 12:36:50. It was
> not waiting on any backends, else it would not have launched the
> shutdown process (which is what emits the other two messages).
>
> [ thinks for a bit ... ] I wonder if Shutdown ought to be marked
> volatile, since it is after all changed by a signal handler. But given
> the way the postmaster is coded, this doesn't seem likely to be an issue.
> Basically all of the code runs with signals blocked.
>
> Can you try to reconstruct what you did on Sep 1, and see whether you
> can reproduce the above behavior?
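
On the volatile question raised above, the portable C idiom for a variable that a signal handler modifies looks roughly like this. It is a generic sketch, not taken from the postmaster; the names `shutdown_requested`, `on_sigint` and `install_and_raise` are made up for illustration.

```c
#include <signal.h>

/* A flag written by a signal handler should be volatile sig_atomic_t,
 * so the compiler re-reads it and the store is atomic with respect to
 * signal delivery. */
static volatile sig_atomic_t shutdown_requested = 0;

/* Handler: only records that the signal arrived. */
static void on_sigint(int signo)
{
    (void) signo;
    shutdown_requested = 1;
}

/* Installs the handler, delivers SIGINT to this process, and returns
 * the flag value seen afterwards (or -1 on failure). raise() does not
 * return until the handler has finished running. */
static int install_and_raise(void)
{
    if (signal(SIGINT, on_sigint) == SIG_ERR)
        return -1;
    if (raise(SIGINT) != 0)
        return -1;
    return shutdown_requested;
}
```

If the code reading the flag only ever runs with signals blocked, as described above, a missing volatile is less likely to matter in practice, but the qualifier is what the C standard asks for.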

The only time I really recall having trouble shutting it down was when
a memory leak had used up all system memory (on 9-9). I no longer know
what happened on 1-9; as far as I remember, and from what I can read
back in the log, I just (tried to) shut it down. Most likely I did that
to free up some extra memory for the postgres 8.1 that was running at
the time; the 8.0.3 wasn't in use anyway.
I'll try to dig up more from the logs and see if I can test a few
reasonable scenarios tomorrow, though.

Best regards,

Arjen
