Re: Initdb-time block size specification

From: David Christensen <david(dot)christensen(at)crunchydata(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: Initdb-time block size specification
Date: 2023-06-30 19:09:55
Message-ID: CAOxo6XKuPJ3opoe2NWMQjjF8KD0aFtw+Vj1iRVow5aB67fp7Fw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 30, 2023 at 1:14 PM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:

> Do we really want to prefix the option with CLUSTER_? That seems to just
> add a lot of noise into the patch, and I don't see much value in this
> rename. I'd prefer keeping BLCKSZ and tweak just the couple places that
> need "default" to use BLCKSZ_DEFAULT or something like that.
>
> But more importantly, I'd say we use CAPITALIZED_NAMES for compile-time
> values, so after making this initdb-time parameter we should abandon
> that (just like fc49e24fa69a did for segment sizes). Perhaps something
> like cluster_block_size would work ... (yes, that touches all the places
> too).

Yes, I can see that being an equivalent change; thanks for the pointer
there. "cluster_block_size" could definitely be an approach, though
since it's currently just a #define for GetBlockSize(), maybe we just
replace it with the equivalent call instead. I was mainly trying to
make it something that was conceptually similar and easy to reason
about without getting bogged down in the details locally, but I can
see that ALL_CAPS does have a specific meaning. Eliminating the BLCKSZ
symbol also made it easier to catch anything that depended on that
value. If we wanted to keep BLCKSZ, I'd be okay with that at this
point instead of the CLUSTER_BLOCK_SIZE approach; it could help make
the patch smaller.
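
To make the naming question concrete, here is a minimal sketch of the two
shapes under discussion. It is not code from the patch: the variable
cluster_block_size, the setter set_cluster_block_size(), the two
blocks_for_bytes_*() helpers, and the accepted size range are all
hypothetical names and assumptions chosen for illustration. Only BLCKSZ,
BLCKSZ_DEFAULT, and the idea of a run-time lookup come from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Compile-time default, in the spirit of the BLCKSZ_DEFAULT suggestion. */
#define BLCKSZ_DEFAULT 8192

/*
 * Hypothetical cluster-wide block size, set once at startup.  The
 * lower-case name follows the convention that ALL_CAPS is reserved
 * for compile-time values.
 */
static uint32_t cluster_block_size = BLCKSZ_DEFAULT;

/*
 * Hypothetical setter.  The accepted range (1kB to 32kB, power of
 * two) is an assumption made for this sketch.
 */
static void
set_cluster_block_size(uint32_t size)
{
    assert(size >= 1024 && size <= 32768);
    assert((size & (size - 1)) == 0);   /* must be a power of two */
    cluster_block_size = size;
}

/*
 * Divisor is a compile-time constant, so the compiler is free to
 * turn the division into a shift.
 */
static inline uint64_t
blocks_for_bytes_static(uint64_t nbytes)
{
    return (nbytes + BLCKSZ_DEFAULT - 1) / BLCKSZ_DEFAULT;
}

/*
 * Divisor is only known at run time, so it has to be loaded and a
 * real division is generally emitted.
 */
static inline uint64_t
blocks_for_bytes_dynamic(uint64_t nbytes)
{
    return (nbytes + cluster_block_size - 1) / cluster_block_size;
}
```

With the default in place both helpers agree; after a call such as
set_cluster_block_size(4096), only the dynamic variant reflects the new
size, which is also why removing the constant symbol makes any remaining
compile-time dependency show up as a build failure.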

> > Initial (basic) performance testing shows only minor changes with the pgbench -S
> > benchmark, though this is obviously an area that will need considerable
> > testing/verification across multiple workloads.
> >
>
> I wonder how to best evaluate/benchmark this. AFAICS what we want to
> measure is the extra cost of making the values dynamic (which means the
> compiler can't just optimize them out). I'd say a "pgbench -S" seems
> like a good test.

Yep, I tested 100 runs apiece with pgbench -S at scale factor 100,
default settings, for optimized builds of the same base commit with and
without the patch, and saw a TPS reduction of around 1% in that case. I
do think we'd want to look at different workloads, though; I presume the
largest impact would be seen when everything is in shared_buffers and no
IO is required.
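
For reference, the comparison described above boils down to a relative
difference of mean TPS between the two builds. The helpers below are a
hypothetical sketch of that arithmetic (mean_tps() and
tps_change_percent() are names invented here, not part of pgbench or the
patch), and they assume the per-run TPS figures have already been
collected into arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n TPS samples; returns 0.0 for an empty set. */
static double
mean_tps(const double *samples, size_t n)
{
    double sum = 0.0;

    assert(samples != NULL || n == 0);
    if (n == 0)
        return 0.0;
    for (size_t i = 0; i < n; i++)
        sum += samples[i];
    return sum / (double) n;
}

/*
 * Percent change of the patched build's mean TPS relative to the
 * baseline build's mean.  A negative result means the patched build
 * is slower.  Returns 0.0 if the baseline mean is zero.
 */
static double
tps_change_percent(const double *baseline, size_t nbaseline,
                   const double *patched, size_t npatched)
{
    double base = mean_tps(baseline, nbaseline);

    if (base == 0.0)
        return 0.0;
    return (mean_tps(patched, npatched) - base) / base * 100.0;
}
```

A result near -1.0 would correspond to the roughly 1% reduction mentioned
above. Comparing only the means ignores run-to-run variance, so a fuller
analysis would also want to look at the spread of the samples.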

Thanks,

David
