Re: Minor race-condition problem during database startup

From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Minor race-condition problem during database startup
Date: 2008-11-24 08:35:35
Message-ID: 492A6757.1020003@sun.com
Lists: pgsql-hackers

Tom Lane wrote:

> What seems to have happened is that the bgwriter didn't get as far as
> the first line of BackgroundWriterMain before the client backend tried
> to issue a checkpoint request.
>
> This is obviously a pretty minor issue, but it still seems worth fixing.
> We could either try to make sure that BgWriterShmem->bgwriter_pid gets
> set before the postmaster "opens its doors" for clients, or allow
> RequestCheckpoint() to wait a little bit if needed for the bgwriter
> to come ready. The latter seems like a more localized change.

I think the postmaster should wait until the bgwriter is up.

Another strange thing in RequestCheckpoint() is the following code:

    else if (kill(BgWriterShmem->bgwriter_pid, SIGINT) != 0)
    {
        if (ntries >= 20)    /* max wait 2.0 sec */
        {
            elog((flags & CHECKPOINT_WAIT) ? ERROR : LOG,
                 "could not signal for checkpoint: %m");
            break;
        }
    }

In my opinion there is no reason to retry the kill() call, because it fails only
if the process does not exist or the caller does not have permission to send the
signal. If either of these situations occurs, it means that the bgwriter is dead
or memory is corrupted. Maybe it is time for PANIC (or FATAL)?
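
To illustrate the two failure modes mentioned above, here is a minimal
standalone sketch, not PostgreSQL code. The names classify_kill,
reaped_child_pid and SignalResult are hypothetical and exist only for this
example. It assumes a POSIX system, where kill() with a valid signal number
fails with ESRCH (no such process) or EPERM (no permission).

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical names for illustration only; not part of PostgreSQL. */
typedef enum
{
    SIGNAL_OK,              /* kill() succeeded */
    SIGNAL_NO_PROCESS,      /* ESRCH: target process does not exist */
    SIGNAL_NO_PERMISSION,   /* EPERM: caller may not signal the target */
    SIGNAL_OTHER_ERROR      /* anything else, e.g. EINVAL for a bad signal */
} SignalResult;

/* Send sig to pid and report why kill() failed, if it did. */
SignalResult
classify_kill(pid_t pid, int sig)
{
    if (kill(pid, sig) == 0)
        return SIGNAL_OK;

    switch (errno)
    {
        case ESRCH:
            return SIGNAL_NO_PROCESS;
        case EPERM:
            return SIGNAL_NO_PERMISSION;
        default:
            return SIGNAL_OTHER_ERROR;
    }
}

/*
 * Fork a child that exits immediately, reap it, and return its pid.
 * After waitpid() returns, the pid no longer refers to a live process
 * (barring pid reuse, which is very unlikely in so short a window).
 */
pid_t
reaped_child_pid(void)
{
    pid_t pid = fork();

    if (pid == 0)
        _exit(0);               /* child: exit at once */
    if (pid > 0)
        waitpid(pid, NULL, 0);  /* parent: reap the child */
    return pid;                 /* -1 if fork() failed */
}
```

Calling classify_kill(reaped_child_pid(), 0) should report SIGNAL_NO_PROCESS
(signal number 0 only performs the existence and permission checks). Neither
ESRCH nor EPERM is a condition that a short sleep and retry would normally
clear, which is the point made above.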

Zdenek
