Re: Does pg_stat_get_live_tuples() matter?

From: Sherrylyn Branchaw <sbranchaw(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Does pg_stat_get_live_tuples() matter?
Date: 2019-04-10 15:50:30
Message-ID: CAB_myF4sZpxNXdb-x=weLpqBDou6uE8FHtM0FVerPM-1J7phkw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> Hmmm ... what was in reltuples again?

Reltuples had the correct number, and the query plans were showing the
correct estimates.

> it's a bit hard to believe that it could get to be off by 1000X. Have
you suppressed autovacuum on this table?

No, but here are some things I've observed:

1) Any time pg_stat_reset() gets run, pg_stat_user_tables.n_live_tup gets
reset to 0. pg_class.reltuples is untouched.

2) If new tuples get inserted or deleted after pg_stat_reset(),
pg_stat_user_tables.n_live_tup will match pg_stat_user_tables.n_tup_ins -
pg_stat_user_tables.n_tup_del up until the next analyze or autoanalyze.

3) Once the next (auto)analyze happens, pg_stat_user_tables.n_live_tup will
be updated to match pg_class.reltuples.

4) If a table is very large, it may be a while before
autovacuum_analyze_scale_factor is reached. We have ours set to 0.01, which
is an order of magnitude larger than 1.8K on a 1.8M row table. (I would
like to tune our settings more, but I have a list of priorities from
management on which this isn't high.)

5) Given 1-4, pg_stat_user_tables.n_live_tup may spend quite a long time
matching pg_stat_user_tables.n_tup_ins - pg_stat_user_tables.n_tup_del
instead of pg_class.reltuples. For example, if a table has 1.8 million
rows, and you insert 5 and delete 4 after a stats reset, n_live_tup will
report that the table has 1 tuple.

6) Failovers, at least in Aurora, apparently cause pg_stat_reset() to be
run, at least judging by the timestamp I'm seeing in
pg_stat_bgwriter.stat_reset. We haven't done a failover in the data center
in a while, and it's less trivial for me to test there atm, so I'm not
certain whether open-source Postgres failovers also reset statistics.

> I don't see anything in the current core code that pays attention to
n_live_tuples. reltuples definitely does matter to the planner, and some
of the sibling counters like n_dead_tuples drive autovacuum, but nothing is
examining n_live_tuples AFAICS.

That's what I thought, but I wanted to make sure I wasn't missing anything
obvious. Thanks!

> some of the sibling counters like n_dead_tuples drive autovacuum

So that's interesting. I knew that, but hadn't thought about the
implications. If pg_stat_reset() is executed by failovers, and a failover
happens just before a table is ready to be vacuumed--say it has 0.009 *
reltuples dead tuples (I'm simplifying the formula here)--then n_dead_tup
gets reset to 0 and the counting starts all over again. Regular failovers
could thus increase bloat by delaying the autovacuum daemon from
recognizing that a table needs to be vacuumed, am I right?

Is it recommended practice to manually VACUUM ANALYZE the whole database
after a failover? Or is resetting stats after a failover just an Aurora
thing? I'm sorry I'm asking the latter question instead of testing, but
I've been ordered not to spend time on improving our vacuuming and
statistics until 5 other large projects are done, and I'm spending a
minimal amount of time anyway just to see how badly frequent failovers
might be affecting us and if there's any action we need to take.

Thanks,
Sherrylyn

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message David Rowley 2019-04-10 15:57:49 Re: pg_indexes doesn't show indexes for partitioned tables - bug or intended?
Previous Message Олег Самойлов 2019-04-10 15:19:53 Re: Invoking user of the function with SECURITY DEFINER