Re: Reduced power consumption in autovacuum launcher process

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reduced power consumption in autovacuum launcher process
Date: 2011-07-18 14:04:21
Message-ID: CA+Tgmoa5EFqmKH3t8KY98dS=1cbfbPosMXGpzOi3ZmsCG+21RA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 18, 2011 at 9:12 AM, Peter Geoghegan <peter(at)2ndquadrant(dot)com> wrote:
>>> Another concern is, what happens when we receive a signal, generically
>>> handled or otherwise, and have to SetLatch() to avoid time-out
>>> invalidation? Should we just live with a spurious
>>> AutoVacLauncherMain() iteration, or should we do something like check
>>> if the return value of WaitLatch indicates that we woke up due to a
>>> SetLatch() call, which must have been within a signal handler, and
>>> that we should therefore goto just before WaitLatch() and elide the
>>> spurious iteration? Given that we can expect some signals to occur
>>> relatively frequently, spurious iterations could be a real concern.
>>
>> Really?  I suspect that it doesn't much matter exactly how many
>> machine language instructions we execute on each wake-up, within
>> reasonable bounds, of course.  Maybe some testing is in order?
>
> There's only one way to get around the time-out invalidation problem
> that I'm aware of - call SetLatch() in the handler. I'd be happy to
> hear alternatives, but until we have an alternative, we're stuck
> managing this in each and every signal handler.
>
> Once we've had the latch set to handle this, and control returns to
> the auxiliary process loop, we now have to decide from within the
> auxiliary if we can figure out that all that happened was a "required"
> wake-up, and thus we shouldn't really go through with another
> iteration. That, or we can simply do the iteration.
>
> I have my doubts that it is acceptable to wake up spuriously in
> response to routine events that there are generic handlers for. Maybe
> this needs to be decided on a case-by-case basis.

I'm confused. If the process gets hit with a signal, it's already
woken up, isn't it? Whatever system call it was blocked on may or may
not get restarted depending on the platform and what the signal
handler does, but from an OS perspective, the process has already been
allocated a time slice and will run until either the time slice is
exhausted or it again blocks.
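
A minimal sketch of that point, assuming a POSIX system; it is not
PostgreSQL code, and every sketch_* name is hypothetical. A handler is
installed without SA_RESTART, a process blocks in read() on an empty pipe,
and an interval timer delivers SIGALRM. The process wakes to run the
handler, and here the blocked call then fails with EINTR instead of being
restarted. The timer repeats so that a signal arriving just before read()
is entered cannot leave the call blocked forever.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Set by the handler; sig_atomic_t is safe to write from a signal handler. */
static volatile sig_atomic_t sketch_got_signal = 0;

static void
sketch_on_alarm(int signo)
{
    (void) signo;
    sketch_got_signal = 1;
}

/*
 * Block in read() on an empty pipe until SIGALRM arrives.
 * Returns the errno left by read(), or 0 if read() did not fail.
 * Setup failures are treated as programming errors in this sketch.
 */
int
sketch_errno_after_interrupted_read(void)
{
    int         fds[2];
    int         rc;
    int         saved_errno;
    char        c;
    ssize_t     n;
    struct sigaction sa;
    struct itimerval timer;

    rc = pipe(fds);
    assert(rc == 0);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = sketch_on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* deliberately no SA_RESTART */
    rc = sigaction(SIGALRM, &sa, NULL);
    assert(rc == 0);

    /* Fire after 20 ms and keep firing every 20 ms until disarmed. */
    memset(&timer, 0, sizeof(timer));
    timer.it_value.tv_usec = 20000;
    timer.it_interval.tv_usec = 20000;
    rc = setitimer(ITIMER_REAL, &timer, NULL);
    assert(rc == 0);
    (void) rc;

    n = read(fds[0], &c, 1);    /* nothing is ever written to the pipe */
    saved_errno = errno;

    /* Disarm the timer and clean up. */
    memset(&timer, 0, sizeof(timer));
    (void) setitimer(ITIMER_REAL, &timer, NULL);
    close(fds[0]);
    close(fds[1]);

    return (n < 0) ? saved_errno : 0;
}
```

With sa_flags set to SA_RESTART instead, many platforms would restart the
read() transparently, which is the platform-dependent behavior mentioned
above; either way the process has already been scheduled to run the handler.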

>> On another note, I might be inclined to write something like:
>>
>> if ((return_value_of_waitlatch & WL_POSTMASTER_DEATH) && !PostmasterIsAlive())
>>   proc_exit(1);
>>
>> ...so as to avoid calling that function unnecessarily on every iteration.
>
> Hmm. I'm not so sure. We're now relying on the return value of
> WaitLatch(), which isn't guaranteed to report all wake-up events
> (although I don't believe it would be a problem in this exact case).
> Previously, we called PostmasterIsAlive() once a second anyway, and
> that wasn't much of a problem.

Ah. OK.
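
For reference, a self-contained sketch of the check being discussed. The
SKETCH_* bit values and the sketch_* functions are stand-ins invented for
this illustration, not PostgreSQL's actual definitions. It shows only the
short-circuit: the liveness probe runs only when the wake-up mask reports
postmaster death. As noted above, this relies on the return value of
WaitLatch() reporting that event, which is not guaranteed for all wake-ups.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real flags come from PostgreSQL. */
#define SKETCH_WL_LATCH_SET        (1 << 0)
#define SKETCH_WL_TIMEOUT          (1 << 1)
#define SKETCH_WL_POSTMASTER_DEATH (1 << 2)

static bool sketch_postmaster_alive = true; /* stand-in for real state */
static int  sketch_liveness_checks = 0;     /* counts probe calls */

/* Stand-in for the comparatively expensive liveness probe. */
static bool
sketch_postmaster_is_alive(void)
{
    sketch_liveness_checks++;
    return sketch_postmaster_alive;
}

/* True if the caller should exit; probes only when the mask asks for it. */
bool
sketch_should_exit(int wake_events)
{
    return (wake_events & SKETCH_WL_POSTMASTER_DEATH) != 0 &&
        !sketch_postmaster_is_alive();
}

/* Usage example: two simulated wake-ups with a dead postmaster. */
void
sketch_demo(void)
{
    bool        exit_now;

    sketch_postmaster_alive = false;
    sketch_liveness_checks = 0;

    /* Plain latch wake-up: the probe is skipped entirely. */
    exit_now = sketch_should_exit(SKETCH_WL_LATCH_SET);
    assert(!exit_now);
    assert(sketch_liveness_checks == 0);

    /* Wake-up that reports postmaster death: probe runs once. */
    exit_now = sketch_should_exit(SKETCH_WL_TIMEOUT | SKETCH_WL_POSTMASTER_DEATH);
    assert(exit_now);
    assert(sketch_liveness_checks == 1);
    (void) exit_now;
}
```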

>>> Incidentally, should I worry about the timeout long for WaitLatch()
>>> overflowing?
>>
>> How would that happen?
>
> struct timeval consists of two longs - one representing seconds,
> and the other representing the balance of microseconds. Previously, we
> didn't combine them into a single microsecond representation. Now, we
> do.
>
> There could perhaps be a very large "nap", as determined by
> launcher_determine_sleep(), so that the total number of microseconds
> passed to WaitLatch() would exceed the maximum long size that can be
> safely represented on some or all platforms. On most 32-bit machines,
> sizeof(long) == sizeof(int), which is just 4 bytes. (2^31) - 1 =
> 2,147,483,647 microseconds = only about 35 minutes. There are corner
> cases, such as if someone were to set autovacuum_naptime to something
> silly.

OK. In that case, my feeling is "yes, you need to worry about that".
I'm not sure exactly what the best solution is: we could either
twiddle the WaitLatch interface some more, or restrict
autovacuum_naptime to at most 30 minutes, or maybe there's some other
option.
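
One possible shape for the "some other option" case, as a rough sketch
only: saturate the combined microsecond count instead of letting it
overflow. The helper name is hypothetical and this is not a proposed patch.
int32_t stands in for a 4-byte long so the arithmetic behaves the same on
every platform, and the usecs argument is assumed to be the sub-second
remainder, as in struct timeval.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a PostgreSQL function: combine seconds and
 * microseconds into one microsecond count, clamped to INT32_MAX
 * (2,147,483,647 us, about 35 minutes) rather than overflowing.
 */
int32_t
sketch_nap_to_usec_saturating(int64_t secs, int32_t usecs)
{
    int64_t     total;

    /* usecs is expected to be the sub-second remainder. */
    assert(usecs >= 0 && usecs < 1000000);

    if (secs < 0)
        return 0;

    /* More than 2147 whole seconds can never fit in 32 bits. */
    if (secs > INT32_MAX / 1000000)
        return INT32_MAX;

    /* Safe in 64 bits: at most 2147 * 1000000 + 999999. */
    total = secs * 1000000 + usecs;
    return (total > INT32_MAX) ? INT32_MAX : (int32_t) total;
}
```

Under these assumptions a 30 minute nap (1800 seconds) converts exactly to
1,800,000,000 microseconds, while 2148 seconds or more is clamped to
INT32_MAX. A caller that clamps would wake early and have to recompute the
remaining sleep, which is a design choice this sketch does not settle.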

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
