Re: RFE: Make statistics robust for unplanned events

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Patrik Novotny <panovotn(at)redhat(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: RFE: Make statistics robust for unplanned events
Date: 2021-04-21 15:04:42
Message-ID: CABUevEw=NphkS7h4rKpQhzKJiHBP_W8igr4_-a4NE00pUAbzbg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 21, 2021 at 5:02 PM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
>
>
> On 4/21/21 2:38 PM, Magnus Hagander wrote:
> > On Tue, Apr 20, 2021 at 2:00 PM Patrik Novotny <panovotn(at)redhat(dot)com> wrote:
> >>
> >> Hello PostgreSQL Hackers,
> >>
> >> is it possible to preserve the PostgreSQL statistics on a server crash?
> >>
> >> Steps to reproduce the behaviour:
> >> 1) Observe the statistics counters, take note
> >> 2) Crash the machine, e.g. with sysrq; perhaps kill -9 on postgresql will already suffice
> >> 3) After recovery, observe the statistics counter again. Have they been reset to zero (Bad) or are they preserved (Good).
> >>
> >> Resetting the counters to zero harms execution planning and auto_vacuum
> >> operations. That can cause growth of database as dead tuples are not removed
> >> at the right time. In the end the database can go offline if autovacuum never runs.
> >
> > The stats for the planner are store persistently in pg_stats though,
> > but autovacuum definitely takes a hit from it, and several other
> > things can too.
> >
> >> As far as I've checked, this would have to be implemented.
> >>
>
> I think the problem with planner stats is that after reset of the
> runtime stats we lose info about which tables may need analyze etc. and
> then fail to run ANALYZE in time. Which may have negative impact on
> performance, of course.
>
> >> My question would be whether there is something that would make
> >> this impossible to implement, and if there isn't, I'd like this to
> >> be considered a feature request.
> >
> > I'm pretty sure everybody would *want* this. At least nobody would be
> > against it. The problem is the potential performance cost of it.
> >
> > Andres mentioned at least once over in the thread about shared memory
> > stats collection that being able to have persistent stats could come
> > out of that one in the future. Whatever is done on the topic should
> > probably be done based on that work, as it provides a better starting
> > point and also one that will stay around.
> >
>
> Right. I think the other question is how often does this happen in
> practice - if your instance crashes often enough to make this an issue,
> then there are probably bigger issues.

Agreed.

I think the bigger problem there is replication failover, but that's
also a different issue (keeping the statistics from the *standby*
wouldn't help you much there, you'd need to replicate it from the
primary).

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2021-04-21 15:13:17 Re: track_planning causing performance regression
Previous Message Tomas Vondra 2021-04-21 15:02:05 Re: RFE: Make statistics robust for unplanned events