Re: Autovacuum launcher doesn't notice death of postmaster immediately

From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Zeugswetter Andreas ADI SD <ZeugswetterA(at)spardat(dot)at>, Andrew Hammond <andrew(dot)george(dot)hammond(at)gmail(dot)com>, "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Autovacuum launcher doesn't notice death of postmaster immediately
Date: 2007-06-12 11:20:21
Message-ID: 466E8175.20003@sun.com
Lists: pgsql-hackers pgsql-patches

Magnus Hagander wrote:
> On Tue, Jun 12, 2007 at 12:23:50PM +0200, Zdenek Kotala wrote:
>> Alvaro Herrera wrote:
>>> Zeugswetter Andreas ADI SD escribió:
>>>>>>>>> The launcher is set up to wake up in autovacuum_naptime seconds
>>>>>>>>> at most.
>>>>>> Imho the fix is usually to have a sleep loop.
>>>>> This is what we have. The sleep time depends on the schedule
>>>>> of next vacuum for the closest database in time. If naptime
>>>>> is high, the sleep time will be high (depending on number of
>>>>> databases needing attention).
>>>> No, I meant a "while (sleep 1(or 10) and counter < longtime) check for
>>>> exit" instead of "sleep longtime".
>>> Ah; yes, what I was proposing (or thought about proposing, not sure if I
>>> posted it or not) was putting an upper limit of 10 seconds in the sleep
>>> (bgwriter sleeps 10 seconds if configured to not do anything). Though
>>> 10 seconds may seem like an eternity for systems like the ones Peter was
>>> talking about, where there is a script trying to restart the server as
>>> soon as the postmaster dies.
>> There is also one "wild" solution. Postmaster and bgwriter would be
>> connected by a socket/pipe, and select() would be used instead of sleep.
>> If the connection unexpectedly fails, select() returns immediately and we
>> are able to handle this issue asap. This socket could also be used in the
>> special cases when we need to wake it up faster.
>
> Given the amount of problems we've had with pipes on win32, let's try to
> avoid adding extra ones unless they're really necessary. If split-sleep
> works, that seems a safer bet.
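
A minimal sketch of the socket/pipe idea quoted above, written in Python rather than the server's C and assuming a Unix-like system (select() on pipes is not available on win32, which is the concern raised above). The function name parent_is_gone is hypothetical and not part of PostgreSQL. The child waits in select() on the read end of a pipe whose write end is held only by the parent; when the parent dies, the kernel closes the write end and the wait ends at once with end-of-file.

```python
import os
import select


def parent_is_gone(read_fd: int, timeout: float) -> bool:
    """Wait up to `timeout` seconds on the pipe's read end.

    Returns True only when the pipe reports end-of-file, which happens
    once every copy of the write end has been closed (for example
    because the process holding it has died). Returns False on a plain
    timeout or when a wake-up byte was received.
    """
    readable, _, _ = select.select([read_fd], [], [], timeout)
    if not readable:
        return False  # timed out: a write end is still open somewhere
    # A readable pipe that yields zero bytes is at end-of-file.
    # A non-empty read is treated here as an early wake-up request.
    return os.read(read_fd, 1) == b""
```

For this to work the parent would have to create the pipe before forking, and the child would have to close its own inherited copy of the write end; otherwise end-of-file is never reported. Writing a single byte from the parent covers the "wake it up faster" case.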

OK, that could be a problem. But I'm afraid split-sleep is not a good
solution either. It would create a lot of race conditions in start/stop
scripts and monitoring tools. It would be much better to improve pg_ctl to
perform cleanup ("pg_ctl cleanup") when the postmaster fails.
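
For comparison, a minimal sketch of the split-sleep approach under discussion, again in Python and not taken from the PostgreSQL source. The names split_sleep and postmaster_died are hypothetical, and comparing os.getppid() against the pid recorded at startup is an assumed way for a direct child to detect the death of its parent on Unix.

```python
import os
import time
from typing import Callable


def postmaster_died(original_ppid: int) -> bool:
    """Assumed check: a direct child is re-parented when its parent dies."""
    return os.getppid() != original_ppid


def split_sleep(total: float, check: Callable[[], bool], step: float = 1.0) -> bool:
    """Sleep up to `total` seconds in slices of at most `step` seconds.

    Calls check() before every slice. Returns True as soon as check()
    reports that the process should exit, False if the full time passed.
    """
    deadline = time.monotonic() + total
    while True:
        if check():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        time.sleep(min(step, remaining))
```

A launcher would call something like split_sleep(naptime, lambda: postmaster_died(ppid), step=10.0). The child can still outlive the postmaster by up to one step, which is the window in which a restart script could see the postmaster gone while its children are still running.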

I think we must offer packagers and integrators a deterministic way to
handle this issue.

Zdenek
