Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: postgresql(at)taljaren(dot)se, pgsql-bugs(at)lists(dot)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Date: 2022-12-18 02:21:47
Message-ID: Y555O1zrtHOshuRC@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On Thu, Dec 15, 2022 at 01:56:30PM -0500, Tom Lane wrote:
> * To fix vacuumdb properly, it might be enough to get it to
> batch VACUUMs, say by naming up to 1000 tables per command
> instead of just one. I'm not sure how that would interact
> with its parallelization logic, though. It's not really
> solving the O(N^2) issue either, just pushing it further out.

I have been thinking about this part, and using a hardcoded rule for
the batches would be tricky. The list of relations returned by the
scan of pg_class are ordered by relpages, so depending on the
distribution of the sizes (few tables with a large size and a lot of
table with small sizes, exponential distribution of table sizes), we
may finish with more downsides than upsides in some cases, even if we
use a linear rule based on the number of relations, or even if we
distribute the relations across the slots in a round robin fashion for
example.

In order to control all that, rather than a hardcoded rule, could it
be as simple as introducing an option like vacuumdb --batch=N
defaulting to 1 to let users control the number of relations grouped
in a single command with a round robin distribution for each slot?
--
Michael

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Christophe Pettus 2022-12-18 02:23:27 Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Previous Message David G. Johnston 2022-12-16 21:26:40 Re: BUG #17724: All autovacuum workers operate on 1 db at a time

Browse pgsql-hackers by date

  From Date Subject
Next Message Christophe Pettus 2022-12-18 02:23:27 Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Previous Message Joseph Koshakow 2022-12-17 23:32:24 Re: Infinite Interval