Re: n_dead_tup could be way off just after a vacuum

From: Frédéric Yhuel <frederic(dot)yhuel(at)dalibo(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Christophe Courtois <christophe(dot)courtois(at)dalibo(dot)com>
Subject: Re: n_dead_tup could be way off just after a vacuum
Date: 2026-03-12 07:47:02
Message-ID: 52ef2ec7-529e-49bd-ba78-3d7f860558b6@dalibo.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Le 11/03/2026 à 16:17, Andres Freund a écrit :
> This is just because of the small gap between the DELETE and the VACUUM. The
> stats from queries are only merged into the shared state every now and then,
> as it'd be way too expensive to do so all the time. Because you issue the
> statements in quick successing, the report of the row insertion and deletions
> are only taken into account after the VACUUM.
>
> If you put a
> SELECT pg_stat_force_next_flush();
>
> after the DELETE, you get the stats you expect:
> ┌─[ RECORD 1 ]─────┬───────────────────────────────┐
> │ relname │ foo │
> │ n_tup_ins │ 1000 │
> │ n_tup_del │ 500 │
> │ n_live_tup │ 500 │
> │ n_dead_tup │ 0 │
> │ last_autovacuum │ (null) │
> │ last_autoanalyze │ (null) │
> │ last_vacuum │ 2026-03-11 11:13:11.936146-04 │
> │ last_analyze │ (null) │
> └──────────────────┴───────────────────────────────┘
>
>
> Note that you'd*also* get the good stats if you didn't do the pg_sleep(1)
> after the VACUUM, because after the VACUUM the stats actually*are*
> accurate. It's just the stats from the DELETE are merged later.
>

Yes. I was wondering if pg_stat_report_vacuum() should clear the pending
stats... but upon further thought, that wouldn't be the right thing to
do, as long as we have more than one process running in parallel (even
if it's just one client backend and one autovac worker). Not to mention
the fact that it may be technically unfeasible or overkill.

>
> I don't think this is a bug. It's just an engineering compromise between cost
> and accuracy. In real scenarios the effect of this is much smaller, because
> either the DELETE is only a smaller portion of the rows, or it takes longer to
> run and thus will trigger a stats flush during transaction commit.

OK thank you! It makes sense indeed.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Keisuke Kuroda 2026-03-12 07:47:21 Re: JumbleQuery ma treat different GROUP BY expr as the same
Previous Message Bertrand Drouvot 2026-03-12 07:11:45 Re: Defend against -ffast-math in meson builds