Re: Auto-vacuum is not running in 9.1.12

From: Prakash Itnal <prakash074(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, rasna(dot)t(at)nokia(dot)com, sandhya(dot)k_s(at)nokia(dot)com, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Auto-vacuum is not running in 9.1.12
Date: 2015-06-21 09:26:26
Message-ID: CAHC5u78Bi1n=MjS-3kuUar3WU0bpgS3UrevsH10ibbs=DodH5w@mail.gmail.com
Lists: pgsql-hackers

Hi,

As far as I understand, it will probably not open the door to worse
situations. Please correct me if my understanding below is wrong.

The latch wakes up in the following three situations:
a) Socket error (=> result is set to a negative number)
b) Timeout (=> result is set to TIMEOUT)
c) Some event arrived on the socket (=> result is set to a non-zero value
if the caller registered for the arrived event; otherwise no value is set)
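
To make the three cases concrete, here is a rough sketch of how I picture
the result being built. The flag names and the function classify_wakeup()
are made up for illustration; this is not the code in unix_latch.c.

```c
#include <assert.h>

/* Hypothetical flag values for this sketch; not PostgreSQL's definitions. */
#define SK_SOCKET_READABLE  (1 << 0)
#define SK_SOCKET_WRITEABLE (1 << 1)
#define SK_TIMEOUT          (1 << 2)

/*
 * rc      : return value of the underlying poll()/select() style call
 * arrived : events that actually fired on the socket
 * wanted  : events the caller registered for
 *
 * Returns a negative number on error, SK_TIMEOUT on timeout, and otherwise
 * the subset of arrived events the caller asked for, which may be 0.
 */
int
classify_wakeup(int rc, int arrived, int wanted)
{
    /* SK_TIMEOUT is an outcome here, not something a caller registers for. */
    assert((wanted & SK_TIMEOUT) == 0);

    if (rc < 0)
        return -1;              /* case (a): socket error */
    if (rc == 0)
        return SK_TIMEOUT;      /* case (b): timeout */
    return arrived & wanted;    /* case (c): 0 if the event was not registered */
}
```

The last line is the interesting one: an event the caller did not register
for leaves the result at zero, and that is the path where the remaining
sleep time gets recalculated.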

Given the above conditions, the result can be zero only if an unregistered
event wakes the latch (*). In that case, the current implementation
recalculates the remaining sleep time. This calculation makes the
situation worse if the clock goes backwards.

The time difference between cur_time (current time) and start_time (time
when the latch wait started) should never be negative, because cur_time is
never earlier than start_time under normal conditions.

delta_timeout = cur_time - start_time;

The difference can be negative only if the clock is shifted into the past,
so such a shift can be detected. And if it can be detected, can it also be
corrected? I think we can correct it and prevent long sleeps caused by
time shifts.

Currently I treat it as TIMEOUT, though conceptually it is not one. The
ideal solution would be to leave this decision to the caller of WaitLatch().
With my limited knowledge of the PostgreSQL code, I think TIMEOUT would be
fine!

(*) The above description is true only for a timed wait. If the latch wait
is started as a blocking wait (no timeout), the above logic does not apply.

On Sat, Jun 20, 2015 at 10:01 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Prakash Itnal <prakash074(at)gmail(dot)com> writes:
> > Sorry for the late response. The current patch only fixes the scenario-1
> > listed below. It will not address the scenario-2. Also we need a fix in
> > unix_latch.c where the remaining sleep time is evaluated, if latch is
> > woken by other events (or result=0). Here to it is possible the latch
> > might go in long sleep if time shifts to past time.
>
> Forcing WL_TIMEOUT if the clock goes backwards seems like quite a bad
> idea to me. That seems like a great way to make a bad situation worse,
> ie it induces failures where there were none before.
>
> regards, tom lane
>

--
Cheers,
Prakash
