Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)bowt(dot)ie>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode
Date: 2023-04-05 03:14:17
Message-ID: CAApHDvqC__T_7J6dNrjUWCHpxwdVPbqFF45fQG7wAagZZ+cm=w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 5 Apr 2023 at 05:53, Melanie Plageman <melanieplageman(at)gmail(dot)com> wrote:
> Attached v10 addresses the review feedback below.

Thanks. Here's another round on v10-0001:

1. The following documentation fragment does not seem to be aligned
with the code:

<literal>16 GB</literal>. The minimum size is the lesser
of 1/8 the size of shared buffers and <literal>128 KB</literal>. The
default value is <literal>-1</literal>. If this value is specified

The relevant code is:

static inline int
StrategyGetClampedBufsize(int bufsize_kb)
{
int sb_limit_kb;
int blcksz_kb = BLCKSZ / 1024;

Assert(blcksz_kb > 0);

sb_limit_kb = NBuffers / 8 * blcksz_kb;

return Min(sb_limit_kb, bufsize_kb);
}

The code seems to mean that the *maximum* is the lesser of 16GB and
shared_buffers / 8. You're saying it's the minimum.

2. I think you could get rid of the double "Buffer Access Stategy" in:

<glossterm linkend="glossary-buffer-access-strategy">Buffer
Access Strategy</glossterm>.
<literal>0</literal> will disable use of a <literal>Buffer
Access Strategy</literal>.
<literal>-1</literal> will set the size to a default of
<literal>256 KB</literal>. The maximum size is

how about:

<glossterm linkend="glossary-buffer-access-strategy">Buffer
Access Strategy</glossterm>.
A setting of <literal>0</literal> will allow the operation to use any
number of <varname>shared_buffers</varname>, whereas
<literal>-1</literal> will set the size to a default of
<literal>256 KB</literal>. The maximum size is

3. In the following snippet you can use <xref linkend="sql-vacuum"/>
or just <command>VACUUM</command>. There are examples of both in that
file. I don't have a preference as it which, but I think what you've
got isn't great.

<link linkend="sql-vacuum"><command>VACUUM</command></link> and
<link linkend="sql-analyze"><command>ANALYZE</command></link>

4. I wonder if there's a reason this needs to be written in the
overview of ANALYZE.

<command>ANALYZE</command> uses a
<glossterm linkend="glossary-buffer-access-strategy">Buffer Access
Strategy</glossterm>
when reading in the sample data. The number of buffers consumed for this can
be controlled by <xref linkend="guc-vacuum-buffer-usage-limit"/> or by using
the <option>BUFFER_USAGE_LIMIT</option> option.

I think it's fine just to mention it under BUFFER_USAGE_LIMIT. It just
does not seem fundamental enough to be worth being upfront about it.
The other things mentioned in that section don't seem related to
parameters, so there might be no better place for those to go. That's
not the case for what you're adding here.

5. I think I'd rather see the details spelt out here rather than
telling the readers to look at what VACUUM does:

Specifies the
<glossterm linkend="glossary-buffer-access-strategy">Buffer
Access Strategy</glossterm>
ring buffer size for <command>ANALYZE</command>. See the
<link linkend="sql-vacuum"><command>VACUUM</command></link> option with
the same name.

6. When I asked about restricting the valid values of
vacuum_buffer_usage_limit to -1 / 0 or 128 KB to 16GB, I didn't expect
you to code in the NBuffers / 8 check. We shouldn't chain
dependencies between GUCs like that. Imagine someone editing their
postgresql.conf after realising shared_buffers is too large for their
RAM, they reduce it and restart. The database fails to start because
vacuum_buffer_usage_limit is too large! Angry DBA?

Take what's already written about vacuum_failsafe_age as your guidance on this:

"The default is 1.6 billion transactions. Although users can set this
value anywhere from zero to 2.1 billion, VACUUM will silently adjust
the effective value to no less than 105% of
autovacuum_freeze_max_age."

Here we just document the silent capping. You can still claim the
valid range is 128KB to 16GB in the docs. You can mention the 1/8th of
shared buffers cap same as what's mentioned about "105%" above.

When I mentioned #4 and #10 in my review of the v9-0001 patch, I just
wanted to not surprise users who do vacuum_buffer_usage_limit = 64 and
magically get 128.

7. In ExecVacuum(), similar to #6 from above, it's also not great that
you're raising an ERROR based on if StrategyGetClampedBufsize() clamps
or not. If someone has a script that does:

VACUUM (BUFFER_USAGE_LIMIT '1 GB'); it might be annoying that it stops
working when someone adjusts shared buffers from 10GB to 6GB.

I really think the NBuffers / 8 clamping just should be done inside
GetAccessStrategyWithSize().

8. I think this ERROR in vacuum.c should mention that 0 is a valid value.

ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("buffer_usage_limit for a vacuum must be between %d KB and %d KB",
MIN_BAS_RING_SIZE_KB, MAX_BAS_RING_SIZE_KB)));

I doubt there's a need to mention -1 as that's the same as not
specifying BUFFER_USAGE_LIMIT.

9. The following might be worthy of a comment explaining the order of
precedence of how we choose the size:

if (params->ring_size == -1)
{
if (VacuumBufferUsageLimit == -1)
bstrategy = GetAccessStrategy(BAS_VACUUM);
else
bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, VacuumBufferUsageLimit);
}
else
bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, params->ring_size);

10. I wonder if you need to keep bufsize_limit_to_nbuffers(). It's
just used once and seems trivial enough just to write that code inside
GetAccessStrategyWithSize().

11. It's probably worth putting the valid range in the sample config:

#vacuum_buffer_usage_limit = -1 # size of vacuum and analyze buffer
access strategy ring.
# -1 to use default,
# 0 to disable vacuum buffer access strategy
# > 0 to specify size <-- here

12. Is bufmgr.h the right place for these?

/*
* Upper and lower hard limits for the Buffer Access Strategy ring size
* specified by the vacuum_buffer_usage_limit GUC and BUFFER_USAGE_LIMIT option
* to VACUUM and ANALYZE.
*/
#define MAX_BAS_RING_SIZE_KB (16 * 1024 * 1024)
#define MIN_BAS_RING_SIZE_KB 128

Your comment implies they're VACUUM / ANALYZE limits. If we want to
impose these limits to all access strategies then these seem like good
names and location, otherwise, I imagine miscadmin.h is the correct
place. If so, they'll likely want to be renamed to something more
VACUUM specific. I don't particularly have a preference. 128 -
1677216 seem like reasonable limits for any buffer access strategy.

13. I think check_vacuum_buffer_usage_limit() does not belong in
freelist.c. Maybe vacuum.c?

14. Not related to this patch, but why do we have half the vacuum
related GUCs in vacuum.c and the other half in globals.c? I see
vacuum_freeze_table_age is defined in vacuum.c but is also needed in
autovacuum.c, so that rules out the globals.c ones being for vacuum.c
and autovacuum.c. It seems a bit messy. I'm not really sure where
VacuumBufferUsageLimit should go now.

> Remaining TODOs:
> - tests
> - do something about config reload changing GUC

Shouldn't table_recheck_autovac() pfree/palloc a new strategy if the
size changes?

I'm not sure what the implications are with that and the other patch
you're working on to allow vacuum config changes mid-vacuum. We'll
need to be careful and not immediately break that if that gets
committed then this does or vice-versa.

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2023-04-05 03:15:57 Re: refactoring relation extension and BufferAlloc(), faster COPY
Previous Message Michael Paquier 2023-04-05 02:54:06 Re: POC: Lock updated tuples in tuple_update() and tuple_delete()