Re: Reducing power consumption on idle servers

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Zheng Li <zhengli10(at)gmail(dot)com>, Jim Nasby <nasbyj(at)amazon(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reducing power consumption on idle servers
Date: 2022-03-24 15:38:52
Message-ID: CA+TgmoYGKbm0gumAmrwE3ssSouS5knrceQdUccEyC6wyqd5oxw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Mar 24, 2022 at 6:59 AM Simon Riggs
<simon(dot)riggs(at)enterprisedb(dot)com> wrote:
> The proposals of this patch are the following, each of which can be
> independently accepted/rejected:
> 1. fix the sleep pattern of bgwriter, walwriter and logical worker
> (directly affects powersave)
> 2. deprecate promote_trigger_file, which then allows us to fix the
> sleep for startup process (directly affects powersave)
> 3. treat hibernation in all procs the same, for simplicity, and to
> make sure we don't forget one later
> 4. provide a design pattern for background worker extensions to
> follow, so as to encourage powersaving

Unfortunately, the patch isn't split in a way that corresponds to this
division. I think I'm +1 on change #2 -- deprecating
promote_trigger_file seems like a good idea to me independently of any
power-saving considerations. But I'm not sure that I am on board with
any of the other changes. I do agree with the basic goal of trying to
reduce unnecessary wakeups, but I think the rest of the patch is
taking a bit of a one-size-fits-all approach where I think that we
might want to be more nuanced. I think there are a couple of different
kinds of cases here.

In some places, like DetermineSleepTime(), we're already willing to
sleep for pretty long periods of time, like a minute. I think in those
cases we could consider doing nothing, on the theory that one wakeup
per minute is already not very much. If we do want to do something, I
think it should be in the direction of convincing ourselves that we
don't need a timeout at all. Having a timeout is a bit like insurance:
it guarantees that if for some reason the event by which we expect to
be awoken doesn't actually wake us up even though something meaningful
has happened, we won't hang forever. But if we think a wakeup per
minute is meaningful and we don't think there's any possibility of a
missed wakeup, let's just wait forever. I don't think we'll avoid many
user complaints by recovering from a missed wakeup in just under an
hour.
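
A minimal sketch of the timeout-as-insurance idea, using Python's
threading.Event as a stand-in for a latch. This is only an analogy, not
PostgreSQL's latch API; wait_for_wakeup and its return strings are
hypothetical names chosen for illustration. With a timeout, a missed
wakeup is noticed when the timeout expires; with timeout_s=None, the
wait lasts until someone sets the event.

```python
import threading


def wait_for_wakeup(event, timeout_s=None):
    """Block until `event` is set, or until `timeout_s` elapses.

    timeout_s=None means wait forever, i.e. no insurance against a
    missed wakeup. Returns "signaled" or "timeout".
    """
    # Event.wait() returns True if the flag was set, False on timeout.
    if event.wait(timeout=timeout_s):
        event.clear()  # consume the wakeup so the next wait blocks again
        return "signaled"
    return "timeout"
```

The trade-off is the one described above: dropping the timeout removes
the periodic wakeup entirely, but only if every event that matters is
certain to set the flag.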

In other places, like WalWriterMain, we're basically polling. One
question in these kinds of cases is whether we can avoid polling in
favor of having some other process wake us up if the event that we
care about happens. This is unlikely to be practical in all cases -
e.g. we can't wait for a promotion trigger file to show up unless we
want to use inotify or something like that. However, we may be able to
avoid polling in some instances. When we can't, then I think it makes
sense to increase the wait time when the system appears to be idle. In
that subset of cases - we're polling and we can't avoid polling - this
kind of approach seems fairly reasonable.
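
For the polling case, the "wait longer when idle" approach can be
sketched roughly as follows. This is an illustrative model, not the
patch's code: poll_intervals, base_ms, idle_loops_before_backoff and
idle_factor are all assumed names and values.

```python
def poll_intervals(activity, base_ms=200, idle_loops_before_backoff=3,
                   idle_factor=50):
    """Return the wait (in ms) chosen after each polling cycle.

    `activity` is an iterable of booleans, True when that cycle found
    work to do. After `idle_loops_before_backoff` consecutive idle
    cycles the wait is stretched by `idle_factor`; any activity resets
    it to the base interval.
    """
    waits = []
    idle_loops = 0
    for found_work in activity:
        if found_work:
            idle_loops = 0  # system is busy again: poll at normal rate
        else:
            idle_loops += 1
        if idle_loops >= idle_loops_before_backoff:
            waits.append(base_ms * idle_factor)  # looks idle: back off
        else:
            waits.append(base_ms)
    return waits
```

Note that in this model the process only notices new activity at its
next poll, so the stretched interval is also the worst-case delay in
reacting, unless some other process wakes it up.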

I do have some concerns about the idea that the right strategy in
general, or even in particular cases, is to multiply by 50. This patch
hasn't invented that problem; it's already there. My concern is that
multiplying a small number by 50 seems very different than multiplying
a large number by 50. If we normally wait for 100ms before checking
for something to happen, and we instead wait for 5s, it's probably not
going to be a huge issue, because 5s is still a short amount of time
for human beings. If we normally wait for a minute before checking for
something to happen, and we instead wait for 50 minutes, that's much
more likely to make a human being unhappy. Therefore, it's very
unclear to me that those cases should be treated the same way.
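
To make the arithmetic concrete, here is a small sketch that applies
the same factor of 50 to the two base timeouts mentioned above. The
function name hibernate_timeout_ms is hypothetical; only the numbers
100ms, one minute and 50 come from the text.

```python
def hibernate_timeout_ms(base_ms, factor=50):
    """Scale a base timeout by a fixed hibernation factor."""
    return base_ms * factor


if __name__ == "__main__":
    for label, base_ms in [("100ms poll", 100), ("1 minute poll", 60_000)]:
        stretched = hibernate_timeout_ms(base_ms)
        print(f"{label}: {base_ms} ms -> {stretched} ms "
              f"({stretched / 60_000:.2f} minutes)")
```

The same multiplier produces 5,000 ms in the first case and 3,000,000
ms (50 minutes) in the second, which is the asymmetry being pointed
out.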

There's a more technical issue with this strategy, too: if we multiply
both short and long timeouts by 50, I think we are going to get pretty
much the same result as if we only multiply the short timeouts by 50.
Why even bother reducing the frequency of wakeups from minutes to
hours if we're elsewhere reducing the frequency from seconds to
minutes? If process A is still waking up every minute, putting process
B in the deep freeze isn't going to do a whole lot in terms of letting
the system go into any kind of deeper sleep.
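
A rough model of this point, assuming each process wakes at a fixed
interval and that wakeups from different processes are not coalesced.
Both function names are hypothetical.

```python
def wakeups_per_hour(intervals_s):
    """Total wakeups per hour across processes with the given periods."""
    return sum(3600 / interval for interval in intervals_s)


def longest_quiet_gap_s(intervals_s):
    """Upper bound on how long the system can go without any wakeup.

    No quiet period can outlast the most frequent waker's interval.
    """
    return min(intervals_s)
```

With process A at 60s, moving process B from 60s to 3000s (50 minutes)
cuts the total from 120 wakeups per hour to about 61, but the longest
quiet gap stays capped at 60s because A still wakes every minute.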

All in all, encouraging developers to adopt a multiply-by-50 rule in
all cases seems too simplistic to me. It may be better than what we're
doing right now, but it doesn't really seem like the right policy.

--
Robert Haas
EDB: http://www.enterprisedb.com
