Re: Reducing power consumption on idle servers

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Zheng Li <zhengli10(at)gmail(dot)com>, Jim Nasby <nasbyj(at)amazon(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reducing power consumption on idle servers
Date: 2023-01-27 06:37:30
Message-ID: CALj2ACVwjbwR9_u8GSZuAkUFrWf1rMJJGRrGBoBthq_NBsL_pw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 25, 2023 at 2:10 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> > Yeah, I definitely want to fix it. I just worry that 60s is so long
> > that it also needs that analysis work to be done to explain that it's
> > OK that we're a bit sloppy on noticing when to wake up, at which point
> > you might as well go to infinity.
>
> Yeah. The perfectionist in me wants to say that there should be
> explicit wakeups for every event of interest, in which case there's
> no need for a timeout. The engineer in me says "but what about bugs?".
> Better a slow reaction than never reacting at all. OTOH, then you
> have to have a discussion about whether 60s (or any other
> ice-cap-friendly value) is an acceptable response time even in the
> presence of bugs.
>
> It's kind of moot until we've reached the point where we can
> credibly claim to have explicit wakeups for every event of
> interest. I don't think we're very close to that today, and
> I do think we should try to get closer. There may come a point
> of diminishing returns though.

IIUC, we're discussing whether to get rid of the hibernate loops (IOW,
sleep, wake up, do work if there is any, sleep again) and instead rely
on wakeup signals from other processes, in order to reduce overall
power consumption. Am I right?
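
To make the two loop shapes concrete, here is a minimal standalone sketch. It is not PostgreSQL code: `ToyLatch` and the `toy_latch_*` functions are hypothetical names, and the latch is modeled with a plain pipe plus `poll()` purely for illustration. A finite timeout gives the hibernate style; a timeout of -1 waits forever and depends entirely on someone setting the latch.

```c
#define _POSIX_C_SOURCE 200809L
#include <poll.h>
#include <unistd.h>

/* Hypothetical toy latch: a pipe whose read end becomes readable when set. */
typedef struct
{
    int fds[2];                 /* fds[0] = read end, fds[1] = write end */
} ToyLatch;

/* Returns 0 on success, -1 on error. */
int
toy_latch_init(ToyLatch *latch)
{
    return pipe(latch->fds);
}

/* Wake up the waiter. Returns 0 on success, -1 on error. */
int
toy_latch_set(ToyLatch *latch)
{
    char        c = 1;

    return (write(latch->fds[1], &c, 1) == 1) ? 0 : -1;
}

/*
 * Wait until the latch is set or timeout_ms elapses.
 * timeout_ms = -1 means "wait forever".
 * Returns 1 if the latch was set, 0 on timeout, -1 on error.
 */
int
toy_latch_wait(ToyLatch *latch, int timeout_ms)
{
    struct pollfd pfd = {.fd = latch->fds[0], .events = POLLIN};
    int         rc = poll(&pfd, 1, timeout_ms);

    if (rc > 0)
    {
        char        buf[64];
        ssize_t     n = read(latch->fds[0], buf, sizeof(buf)); /* drain */

        return (n > 0) ? 1 : -1;
    }
    return rc;                  /* 0 = timed out, -1 = error (e.g. EINTR) */
}
```

With this sketch, a hibernate loop would call `toy_latch_wait(&latch, 60000)` repeatedly and check for work after every return, while the wait-forever style would call `toy_latch_wait(&latch, -1)` once and never run again unless `toy_latch_set()` is called.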

I'm trying to understand this a bit. Can signals (especially the
SIGURG that we use to set latches and wake up processes) ever get lost
on the way to the target process? If yes, how? How frequently can it
happen? Is there any history of reported issues in PostgreSQL caused by
a lost signal?

I'm reading about pending signals and about queuing signals with
sigqueue() (on Linux). Can either of these guarantee that a signal,
once sent, never gets lost?
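
As a small standalone illustration of the pending-signal behavior in question (a sketch, not PostgreSQL code; `count_deliveries` and `on_sigurg` are hypothetical names), the following blocks SIGURG, sends it to the current process several times, and then unblocks it. On Linux, standard signals such as SIGURG are not queued: several instances sent while the signal is blocked collapse into one pending instance, which is still delivered once the signal is unblocked. The sketch assumes a single-threaded process that does not already have SIGURG blocked.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t handler_runs = 0;

static void
on_sigurg(int signo)
{
    (void) signo;
    handler_runs++;
}

/*
 * Block SIGURG, send it to ourselves `sends` times, then unblock it.
 * Sets *was_pending to 1 if SIGURG was reported pending while blocked.
 * Returns the number of times the handler ran, or -1 on error.
 */
int
count_deliveries(int sends, int *was_pending)
{
    struct sigaction sa;
    sigset_t    block, old, pending;

    handler_runs = 0;
    sa.sa_handler = on_sigurg;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGURG, &sa, NULL) != 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGURG);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return -1;

    for (int i = 0; i < sends; i++)
    {
        if (kill(getpid(), SIGURG) != 0)
            return -1;
    }

    if (sigpending(&pending) != 0)
        return -1;
    *was_pending = sigismember(&pending, SIGURG);

    /* Restoring the old mask lets the pending signal be delivered. */
    if (sigprocmask(SIG_SETMASK, &old, NULL) != 0)
        return -1;

    return (int) handler_runs;
}
```

Calling `count_deliveries(3, &pending)` on Linux is expected to report the signal as pending and run the handler once, not three times: the wakeup itself survives, but the count of sends does not. Real-time signals sent with sigqueue() are queued instead, subject to a per-user limit.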

FWIW, a recent commit, cd4329d, which removed promote_trigger_file, also
removed the hibernation and now waits forever, relying on the latch
being set. If we're worried that signals can get lost on the way, then
this also needs to be fixed. I also see a lot of WaitLatch() calls that
wait forever, relying on other processes to wake them up.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
