| From: | Sami Imseih <samimseih(at)gmail(dot)com> |
|---|---|
| To: | Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com> |
| Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Track skipped tables during autovacuum and autoanalyze |
| Date: | 2026-05-15 16:12:36 |
| Message-ID: | CAA5RZ0sVOTPCCbFfADO1+MTn60hmijP6QTDr0V=etQU=S5v=+g@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
> > If the table is dropped, there are no stats to update. right?
>
> Ops, right. I focused too much on "all warnings should be visible in
> the statistic, so the sum of warnings and statistics should match",
> but of course that's not the case.
>
> > However, I'm not entirely sure whether this behavior is always guaranteed.
> > Could anyone clarify this?
>
> There's another different corner-case if I move the injection point
> inside pgstat_report_skipped_vacuum_analyze, after releasing the
> syscache.
>
> If the drop happens at that point, we insert an orphaned record into
> the statistics, and it will be visible with the internal functions
> (e.g. pg_stat_get_skipped_autoanalyze_count).
>
> It is still invisible in the pg_stat_all_tables view, but now that
> I've looked more at the code, I think internally it will stay there
> permanently, even surviving pg_stat_reset?
Yeah, you’re right. I just realized that pgstat_get_entry_ref_locked()
creates an entry if one does not already exist, which means we can
leak entries if the table is dropped concurrently. After releasing the
syscache entry, we still have the relid, but the table may be dropped
before the stats update runs. In that case, the stats update
recreates the relation entry, and that entry is then leaked
permanently.
The tricky part of this feature is that we cannot hold a lock while updating
the entry, because the whole point of the stat is to track cases where
we failed to acquire a lock.
So, it seems pgstat_init_function_usage() deals with the same pattern
and they deal with it by doing a second syscache lookup after creating
the entry, and if the relation is gone, drop the entry. We can do the same
thing here. See v8-0001.
I also attached a v8-0002 building on Zsolt's injection point test and his
earlier suggestion to "move the injection point inside
pgstat_report_skipped_vacuum_analyze, after releasing the syscache".
These are isolation tests that rely on injection points. The tests are:
1. Table Drop + rollback. Lock is skipped and table is dropped but rolled
back. The stats are updated.
2. Table Drop + commit. Lock is skipped and table is dropped and committed.
No table entry to record a stat for.
The tests use pg_stat_get_skipped_vacuum_count to verify that no orphaned
entry is left behind.
--
Sami
| Attachment | Content-Type | Size |
|---|---|---|
| v8-0002-Add-injection-point-test-for-vacuum-skip_locked-s.patch | application/octet-stream | 7.2 KB |
| v8-0001-Track-skipped-vacuum-and-analyze-activity-per-rel.patch | application/octet-stream | 40.7 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Peter Geoghegan | 2026-05-15 16:31:59 | Re: First draft of PG 19 release notes |
| Previous Message | Matheus Alcantara | 2026-05-15 15:46:44 | Re: postgres_fdw: use_scram_passthrough on user mapping is ignored when also set on server |