Re: Initdb-time block size specification

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: David Christensen <david(dot)christensen(at)crunchydata(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: Initdb-time block size specification
Date: 2023-06-30 18:14:18
Message-ID: d1b4d578-e0f8-49f3-714d-8467d8d25cf6@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 6/30/23 19:35, David Christensen wrote:
> Hi,
>
> As discussed somewhat at PgCon, enclosed is v1 of a patch to provide
> variable block sizes; basically instead of BLCKSZ being a compile-time
> constant, a single set of binaries can support all of the block sizes
> Pg can support, using the value stored in pg_control as the basis.
> (Possible future plans would be to make this something even more
> dynamic, such as configured per tablespace, but this is out of scope;
> this just sets up the infrastructure for this.)
>
> Whereas we had traditionally used BLCKSZ to indicate the compile-time selected
> block size, this commit adjusted things so the cluster block size can be
> selected at initdb time.
>
> In order to code for this, we introduce a few new defines:
>
> - CLUSTER_BLOCK_SIZE is the blocksize for this cluster itself. This is not
> valid until BlockSizeInit() has been called in the given backend, which we do as
> early as possible by parsing the ControlFile and using the blcksz field.
>
> - MIN_BLOCK_SIZE and MAX_BLOCK_SIZE are the limits for the selectable block
> size. It is required that CLUSTER_BLOCK_SIZE is a power of 2 between these two
> constants.
>
> - DEFAULT_BLOCK_SIZE is the moral equivalent of BLCKSZ; it is the built-in
> default value. This is used in a few places that just needed a buffer of an
> arbitrary size, but the dynamic value CLUSTER_BLOCK_SIZE should almost always be
> used instead.
>
> - CLUSTER_RELSEG_SIZE is used instead of RELSEG_SIZE, since internally we are
> storing the segment size in terms of number of blocks. RELSEG_SIZE is still
> kept, but is used in terms of the number of blocks of DEFAULT_BLOCK_SIZE;
> CLUSTER_RELSEG_SIZE scales appropriately (and is the only thing used internally)
> to keep the same target total segment size regardless of block size.
>

Do we really want to prefix the option with CLUSTER_? That seems to just
add a lot of noise into the patch, and I don't see much value in this
rename. I'd prefer keeping BLCKSZ and tweak just the couple places that
need "default" to use BLCKSZ_DEFAULT or something like that.

But more importantly, I'd say we use CAPITALIZED_NAMES for compile-time
values, so after making this initdb-time parameter we should abandon
that (just like fc49e24fa69a did for segment sizes). Perhaps something
like cluste_block_size would work ... (yes, that touches all the places
too).

> This patch uses a precalculated table to store the block size itself, as well as
> additional derived values that have traditionally been compile-time
> constants (example: MaxHeapTuplesPerPage). The traditional macro names are kept
> so code that doesn't care about it should not need to change, however the
> definition of these has changed (see the CalcXXX() routines in blocksize.h for
> details).
>
> A new function, BlockSizeInit() populates the appropriate values based on the
> target block size. This should be called as early as possible in any code that
> utilizes block sizes. This patch adds this in the appropriate place on the
> handful of src/bin/ programs that used BLCKSZ, so this caveat mainly impacts new
> code.
>
> Code which had previously used BLCKZ should likely be able to get away with
> changing these instances to CLUSTER_BLOCK_SIZE, unless you're using a structure
> allocated on the stack. In these cases, the compiler will complain about
> dynamic structure. The solution is to utilize an expression with MAX_BLOCK_SIZE
> instead of BLCKSZ, ensuring enough stack space is allocated for the maximum
> size. This also does require using CLUSTER_BLOCK_SIZE or an expression based on
> it when actually using this structure, so in practice more stack space may be
> allocated then used in principal; as long as there is plenty of stack this
> should have no specific impacts on code.
>
> Initial (basic) performance testing shows only minor changes with the pgbench -S
> benchmark, though this is obviously an area that will need considerable
> testing/verification across multiple workloads.
>

I wonder how to best evaluate/benchmark this. AFAICS what we want to
measure is the extra cost of making the values dynamic (which means the
compiler can't just optimize them out). I'd say a "pgbench -S" seems
like a good test.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2023-06-30 18:25:57 Re: Move un-parenthesized syntax docs to "compatibility" for few SQL commands
Previous Message Cary Huang 2023-06-30 18:12:03 Re: sslinfo extension - add notbefore and notafter timestamps