Re: Why is hot_standby_feedback off by default?

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org, Vik Fearing <vik(at)postgresfriends(dot)org>, sirisha chamarthi <sirichamarthi22(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Why is hot_standby_feedback off by default?
Date: 2023-10-22 19:07:59
Message-ID: 8800FAF9-22B0-46AC-817D-9DD6FA08A45D@anarazel.de
Lists: pgsql-hackers

Hi,

On October 22, 2023 4:56:15 AM PDT, Vik Fearing <vik(at)postgresfriends(dot)org> wrote:
>On 10/22/23 09:50, sirisha chamarthi wrote:
>> Is there any specific reason hot_standby_feedback default is set to off?
>
>
>Yes. No one wants a rogue standby to ruin production.

Medium term, I think we need an approximate xid->"time of assignment" mapping that's continually maintained on the primary. One of the things that'd allow us to do is introduce a GUC to control the maximum effect of hs_feedback on the primary, in a useful unit. Numbers of xids are not a useful unit (100k xids is forever on some systems, a few minutes at best on others, the rate is not necessarily that steady when plpgsql exception handlers are used, ...)
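
To make the idea concrete, here is a rough sketch of what such a mapping and a time-based cap on hs_feedback could look like. Nothing like this exists in PostgreSQL as far as this message says; the names XidTimeMap, record, time_of, xid_at and clamp_feedback_xmin are all made up for illustration, xids are treated as 64-bit values so wraparound is ignored, and the samples are assumed to come from a monotonic clock.

```python
import bisect


class XidTimeMap:
    """Hypothetical approximate xid -> assignment-time map built from samples."""

    def __init__(self):
        self.xids = []   # strictly increasing xids (assumed 64-bit, no wraparound)
        self.times = []  # matching sample times in seconds, non-decreasing

    def record(self, xid, when):
        # Skip samples that do not advance the xid (e.g. an idle system).
        if self.xids and xid <= self.xids[-1]:
            return
        self.xids.append(xid)
        self.times.append(when)

    def time_of(self, xid):
        """Estimate when xid was assigned by interpolating between samples."""
        if not self.xids:
            raise ValueError("no samples recorded")
        i = bisect.bisect_right(self.xids, xid)
        if i == 0:
            return self.times[0]
        if i == len(self.xids):
            return self.times[-1]
        x0, x1 = self.xids[i - 1], self.xids[i]
        t0, t1 = self.times[i - 1], self.times[i]
        return t0 + (t1 - t0) * (xid - x0) / (x1 - x0)

    def xid_at(self, when):
        """Estimate the newest xid that had been assigned by time `when`."""
        if not self.xids:
            raise ValueError("no samples recorded")
        i = bisect.bisect_right(self.times, when)
        if i == 0:
            return self.xids[0]
        if i == len(self.times):
            return self.xids[-1]
        x0, x1 = self.xids[i - 1], self.xids[i]
        t0, t1 = self.times[i - 1], self.times[i]
        # bisect_right guarantees t0 <= when < t1, so t1 - t0 is never zero.
        return x0 + int((x1 - x0) * (when - t0) / (t1 - t0))


def clamp_feedback_xmin(xid_map, feedback_xmin, now, max_age_seconds):
    """Honor a standby's xmin only as far back as max_age_seconds allows."""
    oldest_allowed = xid_map.xid_at(now - max_age_seconds)
    return max(feedback_xmin, oldest_allowed)
```

With samples (1000, 0s), (2000, 100s), (2500, 200s), a standby reporting xmin 1200 at time 200s with a 100 second cap would be honored only back to xid 2000, while a standby reporting xmin 2400 would be unaffected. The cap is expressed in seconds, which is the "useful unit" part; the xid it translates to depends on how fast the system burns xids.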

It'd be useful to have such a mapping for other features too. E.g.

- making it visible in pg_stat_activity how problematic a long-running xact is - a 3 day old xact that doesn't have an xid assigned and has a recent xmin is fine, it won't prevent vacuum from doing things. But a somewhat recent xact that still has a snapshot from before an old xact was cancelled could be problematic.

- turn pg_class.relfrozenxid into an understandable timeframe. It's a fair bit of mental effort to classify "370M xids old" into problem/fine (it's e.g. not a problem on a system with a high xid rate, on a big table that takes a bit to vacuum).

- using the mapping to compute an xid consumption rate IMO would be one building block for smarter AV scheduling. Together with historical vacuum runtimes it'd allow us to start vacuuming early enough to prevent hitting thresholds, adapt pacing, prioritize between tables etc.
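
As a rough illustration of the last two points, the sketch below derives an xid consumption rate from (xid, time) samples and uses it to decide whether a table's vacuum should start now. It is only a toy model under stated assumptions: the function names, the 400M threshold used in the example and the safety factor are all hypothetical, samples are assumed sorted by time, and xid wraparound is ignored.

```python
def xid_rate(samples, window_seconds):
    """Average xids per second over the trailing window.

    samples: list of (xid, time_in_seconds) tuples sorted by time.
    """
    newest_xid, newest_time = samples[-1]
    base_xid, base_time = newest_xid, newest_time
    # Walk backwards to the oldest sample still inside the window.
    for xid, when in reversed(samples):
        if newest_time - when > window_seconds:
            break
        base_xid, base_time = xid, when
    elapsed = newest_time - base_time
    if elapsed <= 0:
        return 0.0
    return (newest_xid - base_xid) / elapsed


def seconds_until_threshold(current_age_xids, threshold_xids, rate):
    """Time left before a table's xid age reaches the threshold."""
    remaining = threshold_xids - current_age_xids
    if remaining <= 0:
        return 0.0
    if rate <= 0:
        return float("inf")
    return remaining / rate


def should_start_vacuum(current_age_xids, threshold_xids, rate,
                        expected_vacuum_seconds, safety_factor=2.0):
    """Start early enough that the vacuum finishes before the threshold."""
    time_left = seconds_until_threshold(current_age_xids, threshold_xids, rate)
    return time_left <= expected_vacuum_seconds * safety_factor
```

For a table that is 370M xids old with a hypothetical 400M threshold and an hour-long historical vacuum runtime, a rate of 20 xids/s leaves about 1.5M seconds of headroom, so nothing needs to happen yet; at 20,000 xids/s the headroom shrinks to 1,500 seconds and the same age becomes urgent. The xid count alone does not tell you which of the two situations you are in.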

Greetings,

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
