Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: postgresql(at)taljaren(dot)se, pgsql-hackers(at)lists(dot)postgresql(dot)org, Michael Paquier <michael(at)paquier(dot)xyz>
Subject: Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Date: 2022-12-28 20:13:23
Message-ID: 407579.1672258403@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

[ redirecting to -hackers because patch attached ]

Peter Geoghegan <pg(at)bowt(dot)ie> writes:
> On Fri, Dec 16, 2022 at 6:49 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> That is a really good point. How about teaching VACUUM to track
>> the oldest original relfrozenxid and relminmxid among the table(s)
>> it processed, and skip vac_update_datfrozenxid unless at least one
>> of those matches the database's values? For extra credit, also
>> skip if we didn't successfully advance the source rel's value.

> Hmm. I think that that would probably work.

I poked into that idea some more and concluded that getting VACUUM to
manage it behind the user's back is not going to work very reliably.
The key problem is explained by this existing comment in autovacuum.c:

* Even if we didn't vacuum anything, it may still be important to do
* this, because one indirect effect of vac_update_datfrozenxid() is to
* update ShmemVariableCache->xidVacLimit. That might need to be done
* even if we haven't vacuumed anything, because relations with older
* relfrozenxid values or other databases with older datfrozenxid values
* might have been dropped, allowing xidVacLimit to advance.

That is, if the table that's holding back datfrozenxid gets dropped
between VACUUM runs, VACUUM would never think that it might have
advanced the global minimum.

I'm forced to the conclusion that we have to expose some VACUUM
options if we want this to work well. Attached is a draft patch
that invents SKIP_DATABASE_STATS and ONLY_DATABASE_STATS options
(name bikeshedding welcome) and teaches vacuumdb to use them.

Light testing says that this is a win: even on the regression
database, which isn't all that big, I see a drop in vacuumdb's
runtime from ~260 ms to ~175 ms. Of course this is a case where
VACUUM doesn't really have anything to do, so it's a best-case
scenario ... but still, I was expecting the effect to be barely
above noise with this many tables, yet it's a good bit more.

regards, tom lane

Attachment Content-Type Size
allow-vacuum-to-skip-stats-update-1.patch text/x-diff 15.5 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Nathan Bossart 2022-12-28 21:12:53 Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Previous Message David G. Johnston 2022-12-28 17:08:26 BUG #17732: pg_restore fails with check constraint

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2022-12-28 21:12:53 Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Previous Message Tom Lane 2022-12-28 18:43:27 Re: perl 5.36, C99, -Wdeclaration-after-statement -Wshadow=compatible-local