From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
Cc: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)bowt(dot)ie>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com> |
Subject: | Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode |
Date: | 2023-03-18 18:30:38 |
Message-ID: | ZBYDTrD1kyGg+HkS@telsasoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> Subject: [PATCH v4 3/3] add vacuum[db] option to specify ring_size and guc
> + Specifies the ring buffer size to be used for a given invocation of
> + <command>VACUUM</command> or instance of autovacuum. This size is
> + converted to a number of shared buffers which will be reused as part of
I'd say "specifies the size of shared_buffers to be reused as .."
> + a <literal>Buffer Access Strategy</literal>. <literal>0</literal> will
> + disable use of a <literal>Buffer Access Strategy</literal>.
> + <literal>-1</literal> will set the size to a default of <literal>256
> + kB</literal>. The maximum ring buffer size is <literal>16 GB</literal>.
> + Though you may set <varname>vacuum_buffer_usage_limit</varname> below
> + <literal>128 kB</literal>, it will be clamped to <literal>128
> + kB</literal> at runtime. The default value is <literal>-1</literal>.
> + This parameter can be set at any time.
GUC docs usually also say something like
"If this value is specified without units, it is taken as .."
> + is used to calculate a number of shared buffers which will be reused as
*the* number?
> + <command>VACUUM</command>. The analyze stage and parallel vacuum workers
> + do not use this size.
I think what you mean is that vacuum's heap scan stage uses the
strategy, but the index scan/cleanup phases doesn't?
> + The size in kB of the ring buffer used for vacuuming. This size is used
> + to calculate a number of shared buffers which will be reused as part of
*the* number
> +++ b/doc/src/sgml/ref/vacuumdb.sgml
The docs here duplicate the sql-vacuum docs. It seems better to refer
to the vacuum page for details, like --parallel does.
Unrelated: it would be nice if the client-side options were documented
separately from the server-side options. Especially due to --jobs and
--parallel.
> + if (!parse_int(vac_buffer_size, &result, GUC_UNIT_KB, NULL))
> + {
> + ereport(ERROR,
> + (errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("buffer_usage_limit for a vacuum must be between -1 and %d. %s is invalid.",
> + MAX_BAS_RING_SIZE_KB, vac_buffer_size),
> + parser_errposition(pstate, opt->location)));
> + }
> +
> + /* check for out-of-bounds */
> + if (result < -1 || result > MAX_BAS_RING_SIZE_KB)
> + {
> + ereport(ERROR,
> + (errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("buffer_usage_limit for a vacuum must be between -1 and %d",
> + MAX_BAS_RING_SIZE_KB),
> + parser_errposition(pstate, opt->location)));
> + }
I think these checks could be collapsed into a single ereport().
if !parse_int() || (result < -1 || result > MAX_BAS_RINGSIZE_KB):
ereport(ERROR,
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("buffer_usage_limit for a vacuum must be an integer between -1 and %d",
MAX_BAS_RING_SIZE_KB),
There was a recent, similar, and unrelated suggestion here:
https://www.postgresql.org/message-id/20230314.135859.260879647537075548.horikyota.ntt%40gmail.com
> +#vacuum_buffer_usage_limit = -1 # size of vacuum buffer access strategy ring.
> + # -1 to use default,
> + # 0 to disable vacuum buffer access strategy and use shared buffers
I think it's confusing to say "and use shared buffers", since
"strategies" also use shared_buffers. It seems better to remove those 4
words.
> @@ -550,6 +563,13 @@ vacuum_one_database(ConnParams *cparams,
> pg_fatal("cannot use the \"%s\" option on server versions older than PostgreSQL %s",
> "--parallel", "13");
>
> + // TODO: this is a problem: if the user specifies this option with -1 in a
> + // version before 16, it will not produce an error message. it also won't
> + // do anything, but that still doesn't seem right.
Actually, that seems fine to me. If someone installs v16 vacuumdb, they
can run it against old servers and specify the option as -1 without it
failing with an error. I don't know if anyone will find that useful,
but it doesn't seem unreasonable.
I still think adding something to the glossary would be good.
Buffer Access Strategy:
A circular/ring buffer used for reading or writing data pages from/to
the operating system. Ring buffers are used for sequential scans of
large tables, VACUUM, COPY FROM, CREATE TABLE AS SELECT, ALTER TABLE,
and CLUSTER. By using only a limited portion of >shared_buffers<, the
ring buffer avoids avoids evicting large amounts of data whenever a
backend performs bulk I/O operations. Use of a ring buffer also forces
the backend to write out its own dirty pages, rather than leaving them
behind to be cleaned up by other backends.
If there's a larger section added than a glossary entry, the text could
be promoted from src/backend/storage/buffer/README to doc/.
--
Justin
From | Date | Subject | |
---|---|---|---|
Next Message | Joseph Koshakow | 2023-03-18 18:48:33 | Re: Infinite Interval |
Previous Message | Tom Lane | 2023-03-18 18:18:04 | Re: generate_series for timestamptz and time zone problem |