From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, postgresql(at)taljaren(dot)se, pgsql-hackers(at)lists(dot)postgresql(dot)org, Michael Paquier <michael(at)paquier(dot)xyz>
Subject: Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Date: 2022-12-29 03:03:29
Message-ID: 20221229030329.fbpiitatmowzza6c@awork3.anarazel.de
Lists: pgsql-bugs pgsql-hackers
Hi,
On 2022-12-28 15:13:23 -0500, Tom Lane wrote:
> [ redirecting to -hackers because patch attached ]
>
> Peter Geoghegan <pg(at)bowt(dot)ie> writes:
> > On Fri, Dec 16, 2022 at 6:49 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> That is a really good point. How about teaching VACUUM to track
> >> the oldest original relfrozenxid and relminmxid among the table(s)
> >> it processed, and skip vac_update_datfrozenxid unless at least one
> >> of those matches the database's values? For extra credit, also
> >> skip if we didn't successfully advance the source rel's value.
>
> > Hmm. I think that that would probably work.
>
> I poked into that idea some more and concluded that getting VACUUM to
> manage it behind the user's back is not going to work very reliably.
> The key problem is explained by this existing comment in autovacuum.c:
>
> * Even if we didn't vacuum anything, it may still be important to do
> * this, because one indirect effect of vac_update_datfrozenxid() is to
> * update ShmemVariableCache->xidVacLimit. That might need to be done
> * even if we haven't vacuumed anything, because relations with older
> * relfrozenxid values or other databases with older datfrozenxid values
> * might have been dropped, allowing xidVacLimit to advance.
>
> That is, if the table that's holding back datfrozenxid gets dropped
> between VACUUM runs, VACUUM would never think that it might have
> advanced the global minimum.
I wonder if a less aggressive version of this idea might still work. Perhaps
we could use ShmemVariableCache->latestCompletedXid or
ShmemVariableCache->nextXid to skip at least some updates?
Obviously this isn't going to help if there's a lot of concurrent activity,
but the case of just running vacuumdb -a might be substantially improved.
Separately, I wonder if it's worth micro-optimizing vac_update_datfrozenxid() a
bit. E.g. I see a noticeable speedup from bypassing systable_getnext() and using
heap_getnext() directly. It's really too bad that we want to check for "in the
future" xids, otherwise we could use a ScanKey to filter at a lower level.
Greetings,
Andres Freund