Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)bowt(dot)ie>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode
Date: 2023-04-05 19:25:52
Message-ID: CAAKRu_ZX5hc5LB+h-UoO=ab9diP-KqnXyc=-3oNu+wvR98b9Ug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

v11 attached with updates detailed below.

On Tue, Apr 4, 2023 at 11:14 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> On Wed, 5 Apr 2023 at 05:53, Melanie Plageman <melanieplageman(at)gmail(dot)com> wrote:
> > Attached v10 addresses the review feedback below.
>
> Thanks. Here's another round on v10-0001:
>
> 1. The following documentation fragment does not seem to be aligned
> with the code:
>
> <literal>16 GB</literal>. The minimum size is the lesser
> of 1/8 the size of shared buffers and <literal>128 KB</literal>. The
> default value is <literal>-1</literal>. If this value is specified
>
> The relevant code is:
>
> static inline int
> StrategyGetClampedBufsize(int bufsize_kb)
> {
> int sb_limit_kb;
> int blcksz_kb = BLCKSZ / 1024;
>
> Assert(blcksz_kb > 0);
>
> sb_limit_kb = NBuffers / 8 * blcksz_kb;
>
> return Min(sb_limit_kb, bufsize_kb);
> }
>
> The code seems to mean that the *maximum* is the lesser of 16GB and
> shared_buffers / 8. You're saying it's the minimum.

Good catch. Fixed.

> 2. I think you could get rid of the double "Buffer Access Stategy" in:
>
> <glossterm linkend="glossary-buffer-access-strategy">Buffer
> Access Strategy</glossterm>.
> <literal>0</literal> will disable use of a <literal>Buffer
> Access Strategy</literal>.
> <literal>-1</literal> will set the size to a default of
> <literal>256 KB</literal>. The maximum size is
>
> how about:
>
> <glossterm linkend="glossary-buffer-access-strategy">Buffer
> Access Strategy</glossterm>.
> A setting of <literal>0</literal> will allow the operation to use any
> number of <varname>shared_buffers</varname>, whereas
> <literal>-1</literal> will set the size to a default of
> <literal>256 KB</literal>. The maximum size is

I've made these changes.

> 3. In the following snippet you can use <xref linkend="sql-vacuum"/>
> or just <command>VACUUM</command>. There are examples of both in that
> file. I don't have a preference as it which, but I think what you've
> got isn't great.
>
> <link linkend="sql-vacuum"><command>VACUUM</command></link> and
> <link linkend="sql-analyze"><command>ANALYZE</command></link>

I've updated it to use the link. I thought it would be nice to have the
link in case the reader wants to look at the BUFFER_USAGE_LIMIT option
docs there.

> 4. I wonder if there's a reason this needs to be written in the
> overview of ANALYZE.
>
> <command>ANALYZE</command> uses a
> <glossterm linkend="glossary-buffer-access-strategy">Buffer Access
> Strategy</glossterm>
> when reading in the sample data. The number of buffers consumed for this can
> be controlled by <xref linkend="guc-vacuum-buffer-usage-limit"/> or by using
> the <option>BUFFER_USAGE_LIMIT</option> option.
>
> I think it's fine just to mention it under BUFFER_USAGE_LIMIT. It just
> does not seem fundamental enough to be worth being upfront about it.
> The other things mentioned in that section don't seem related to
> parameters, so there might be no better place for those to go. That's
> not the case for what you're adding here.

I updated this.

> 5. I think I'd rather see the details spelt out here rather than
> telling the readers to look at what VACUUM does:
>
> Specifies the
> <glossterm linkend="glossary-buffer-access-strategy">Buffer
> Access Strategy</glossterm>
> ring buffer size for <command>ANALYZE</command>. See the
> <link linkend="sql-vacuum"><command>VACUUM</command></link> option with
> the same name.

I've updated it to contain the same text, as relevant, as the VACUUM
option contains. Note that both rely on the vacuum_buffer_usage_limit
GUC documentation for a description of upper and lower bounds.

> 6. When I asked about restricting the valid values of
> vacuum_buffer_usage_limit to -1 / 0 or 128 KB to 16GB, I didn't expect
> you to code in the NBuffers / 8 check. We shouldn't chain
> dependencies between GUCs like that. Imagine someone editing their
> postgresql.conf after realising shared_buffers is too large for their
> RAM, they reduce it and restart. The database fails to start because
> vacuum_buffer_usage_limit is too large! Angry DBA?
>
> Take what's already written about vacuum_failsafe_age as your guidance on this:
>
> "The default is 1.6 billion transactions. Although users can set this
> value anywhere from zero to 2.1 billion, VACUUM will silently adjust
> the effective value to no less than 105% of
> autovacuum_freeze_max_age."
>
> Here we just document the silent capping. You can still claim the
> valid range is 128KB to 16GB in the docs. You can mention the 1/8th of
> shared buffers cap same as what's mentioned about "105%" above.
>
> When I mentioned #4 and #10 in my review of the v9-0001 patch, I just
> wanted to not surprise users who do vacuum_buffer_usage_limit = 64 and
> magically get 128.

> 7. In ExecVacuum(), similar to #6 from above, it's also not great that
> you're raising an ERROR based on if StrategyGetClampedBufsize() clamps
> or not. If someone has a script that does:
>
> VACUUM (BUFFER_USAGE_LIMIT '1 GB'); it might be annoying that it stops
> working when someone adjusts shared buffers from 10GB to 6GB.
>
> I really think the NBuffers / 8 clamping just should be done inside
> GetAccessStrategyWithSize().

Got it. I've done what you suggested.
I had some logic issues as well that I fixed and reorderd the code.

> 8. I think this ERROR in vacuum.c should mention that 0 is a valid value.
>
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> errmsg("buffer_usage_limit for a vacuum must be between %d KB and %d KB",
> MIN_BAS_RING_SIZE_KB, MAX_BAS_RING_SIZE_KB)));
>
> I doubt there's a need to mention -1 as that's the same as not
> specifying BUFFER_USAGE_LIMIT.

I've done this. I didn't say that 0 meant disabling the strategy. Do you
think that would be useful?

> 9. The following might be worthy of a comment explaining the order of
> precedence of how we choose the size:
>
> if (params->ring_size == -1)
> {
> if (VacuumBufferUsageLimit == -1)
> bstrategy = GetAccessStrategy(BAS_VACUUM);
> else
> bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, VacuumBufferUsageLimit);
> }
> else
> bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, params->ring_size);

I've updated this. Also, after doing so, I realized the if/else logic
here and in ExecVacuum() could be better and updated the ordering to
more closely mirror the human readable logic.

> 10. I wonder if you need to keep bufsize_limit_to_nbuffers(). It's
> just used once and seems trivial enough just to write that code inside
> GetAccessStrategyWithSize().

I've gotten rid of it.

> 11. It's probably worth putting the valid range in the sample config:
>
> #vacuum_buffer_usage_limit = -1 # size of vacuum and analyze buffer
> access strategy ring.
> # -1 to use default,
> # 0 to disable vacuum buffer access strategy
> # > 0 to specify size <-- here

Done.

> 12. Is bufmgr.h the right place for these?
>
> /*
> * Upper and lower hard limits for the Buffer Access Strategy ring size
> * specified by the vacuum_buffer_usage_limit GUC and BUFFER_USAGE_LIMIT option
> * to VACUUM and ANALYZE.
> */
> #define MAX_BAS_RING_SIZE_KB (16 * 1024 * 1024)
> #define MIN_BAS_RING_SIZE_KB 128
>
> Your comment implies they're VACUUM / ANALYZE limits. If we want to
> impose these limits to all access strategies then these seem like good
> names and location, otherwise, I imagine miscadmin.h is the correct
> place. If so, they'll likely want to be renamed to something more
> VACUUM specific. I don't particularly have a preference. 128 -
> 1677216 seem like reasonable limits for any buffer access strategy.

I don't assert on these limits in GetAccessStrategyWithSize(), and,
since the rest of the code is mainly only concerned with vacuum, I think
it is better to make these limits vacuum-specific. If we decide to make
other access strategies configurable, we can generalize these macros
then. As such, I have moved them into miscadmin.h.

> 13. I think check_vacuum_buffer_usage_limit() does not belong in
> freelist.c. Maybe vacuum.c?

I've moved it to vacuum.c. I put it above ExecVacuum() since that would
be correct alphabetically, but I'm not sure if it would be better to
move it down since ExecVacuum() is the "main entry point".

> 14. Not related to this patch, but why do we have half the vacuum
> related GUCs in vacuum.c and the other half in globals.c? I see
> vacuum_freeze_table_age is defined in vacuum.c but is also needed in
> autovacuum.c, so that rules out the globals.c ones being for vacuum.c
> and autovacuum.c. It seems a bit messy. I'm not really sure where
> VacuumBufferUsageLimit should go now.

I've left it where it is and added a (helpful?) comment.

> > Remaining TODOs:
> > - tests
> > - do something about config reload changing GUC
>
> Shouldn't table_recheck_autovac() pfree/palloc a new strategy if the
> size changes?

See thoughts about that below in response to Andres' mail.

> I'm not sure what the implications are with that and the other patch
> you're working on to allow vacuum config changes mid-vacuum. We'll
> need to be careful and not immediately break that if that gets
> committed then this does or vice-versa.

We can think hard about this. If we go with adding a TODO for the size,
and keeping the same ring, it won't be a problem.

On Wed, Apr 5, 2023 at 1:05 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2023-04-04 13:53:15 -0400, Melanie Plageman wrote:
> > > 8. I don't quite follow this comment:
> > >
> > > /*
> > > * TODO: should this be 0 so that we are sure that vacuum() never
> > > * allocates a new bstrategy for us, even if we pass in NULL for that
> > > * parameter? maybe could change how failsafe NULLs out bstrategy if
> > > * so?
> > > */
> > >
> > > Can you explain under what circumstances would vacuum() allocate a
> > > bstrategy when do_autovacuum() would not? Is this a case of a config
> > > reload where someone changes vacuum_buffer_usage_limit from 0 to
> > > something non-zero? If so, perhaps do_autovacuum() needs to detect
> > > this and allocate a strategy rather than having vacuum() do it once
> > > per table (wastefully).
>
> Hm. I don't much like that we use a single strategy for multiple tables
> today. That way even tiny tables never end up in shared_buffers. But that's
> really a discussion for a different thread. However, if were to use a
> per-table bstrategy, it'd be a lot easier to react to changes of the config.

Agreed. I was wondering if it is okay to do the palloc()/pfree() for
every table given that some may be small.

> I doubt it's worth adding complications to the code for changing the size of
> the ringbuffer during an ongoing vacuum scan, at least for 16. Reacting to
> enabling/disbling the ringbuffer alltogether seems a bit more important, but
> still not crucial compared to making it configurable at all.
>
> I think it'd be OK to add a comment saying something like "XXX: In the future
> we might want to react to configuration changes of the ring buffer size during
> a vacuum" or such.

I've added the XXX to the autovacuum code. I think you mean it also
could be considered for VACUUM, but I've refrained from mentioning that
for now.

> WRT to the TODO specifically: Yes, passing in 0 seems to make sense. I don't
> see a reason not to do so? But perhaps there's a better solution:

I've done that (passed in a 0), I was concerned that future code may
reference this vacuum param and expect it to be aligned with the Buffer
Access Strategy in use. Really only vacuum() should be referencing the
params, so, perhaps it is not an issue...

Okay, now I've convinced myself that it is better to allocate the
strategy in ExecVacuum(). Then we can get rid of the
VacuumParams->ring_size altogether.

I haven't done that in this version because of the below concern (re: it
being appropriate to allocate the strategy in ExecVacuum() given its
current concern/focus).

> Perhaps the best solution for autovac vs interactive vacuum issue would be to
> move the allocation of the bstrategy to ExecVacuum()?

Thought about this -- I did think it might be a bit weird since
ExecVacuum() mainly does option handling and sanity checking. Doing
Buffer Access Strategy allocation seemed a bit out of place. I've left
it as is, but would be happy to change it if the consensus is that this
is better.

> Random note while looking at the code:
> ISTM that adding handling of -1 in GetAccessStrategyWithSize() would make the
> code more readable. Instead of
>
> if (params->ring_size == -1)
> {
> if (VacuumBufferUsageLimit == -1)
> bstrategy = GetAccessStrategy(BAS_VACUUM);
> else
> bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, VacuumBufferUsageLimit);
> }
> else
> bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, params->ring_size);
>
> you could just have something like:
> bstrategy = GetAccessStrategyWithSize(BAS_VACUUM,
> params->ring_size != -1 ? params->ring_size : VacuumBufferUsageLimit);
>
> by falling back to the default values from GetAccessStrategy().
>
> Or even more "extremely", you could entirely remove references to
> VacuumBufferUsageLimit and handle that in freelist.c

Hmm. I see what you mean.

I've updated it to this, which is a bit better.

if (params->ring_size > -1)
bstrategy = GetAccessStrategyWithSize(BAS_VACUUM,
params->ring_size);
else if (VacuumBufferUsageLimit > -1)
bstrategy = GetAccessStrategyWithSize(BAS_VACUUM,
VacuumBufferUsageLimit);
else
bstrategy = GetAccessStrategy(BAS_VACUUM);

Not referencing VacuumBufferUsageLimit except in freelist.c is more
challenging because I think it would be weird to have
GetAccessStrategyWithSize() call GetAccessStrategy() which then calls
GetAccessStrategyWithSize().

- Melanie

Attachment Content-Type Size
v11-0002-Add-buffer-usage-limit-option-to-vacuumdb.patch text/x-patch 5.3 KB
v11-0001-Add-VACUUM-BUFFER_USAGE_LIMIT-option-and-GUC.patch text/x-patch 25.3 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2023-04-05 19:35:05 Re: monitoring usage count distribution
Previous Message Andres Freund 2023-04-05 19:09:21 Re: monitoring usage count distribution