Re: Initdb-time block size specification

From: David Christensen <david(dot)christensen(at)crunchydata(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: Initdb-time block size specification
Date: 2023-08-31 01:50:53
Message-ID: CAOxo6X+yB0XKRJUXM1q1ZoP58HH6=Fhup95c7gFTYn__454Bnw@mail.gmail.com
Lists: pgsql-hackers

Hi, enclosed is v2 of the variable blocksize patch. This series is
based atop 9b581c5341.

Preparation phase:

0001 - add utility script for retokenizing all necessary scripts.
This is mainly for my own use in generating 0003, which is a simple
rename/preparation patch to change all symbols from their UPPER_CASE
to lower_case form, with several exceptions in renames.
0002 - add script to harness 0001 and apply to the relevant files in the repo
0003 - capture the effects of 0002 on the repo

The other patches in this series are as follows:

0004 - the "main" variable blocksize patch where the bulk of the code
changes take place - see comments here
0005 - utility functions for fast div/mod operations; basically
Montgomery multiplication
0006 - use fastdiv code in the visibility map, the main place where
this change is required
0007 - (optional) add/use libdivide for division which is license
compatible with other headers we bundle
0008 - (optional) tweaks to libdivide to make compiler/CI happy

I have also replaced multiple instances of division or multiplication
of BLOCKSZ with bitshift operations based on the number of bits in the
underlying blocksize.
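
To illustrate the bitshift replacement, here is a minimal sketch. It
assumes the block size is a power of two and that its log2 is stored
alongside it at startup. The names (cluster_block_size,
cluster_block_size_bits, set_block_size, and the helpers) are
hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical runtime state: the block size chosen at initdb time and its log2. */
static uint32_t cluster_block_size = 8192;
static uint32_t cluster_block_size_bits = 13;

/* Record a new block size and derive its bit count; size must be a power of two. */
static void
set_block_size(uint32_t size)
{
    uint32_t bits = 0;

    assert(size != 0 && (size & (size - 1)) == 0);
    while ((UINT32_C(1) << bits) < size)
        bits++;
    cluster_block_size = size;
    cluster_block_size_bits = bits;
}

/* nblocks * block size, written as a left shift */
static inline uint64_t
blocks_to_bytes(uint32_t nblocks)
{
    return (uint64_t) nblocks << cluster_block_size_bits;
}

/* nbytes / block size, written as a right shift */
static inline uint32_t
bytes_to_blocks(uint64_t nbytes)
{
    return (uint32_t) (nbytes >> cluster_block_size_bits);
}

/* nbytes % block size, written as a mask */
static inline uint32_t
offset_in_block(uint64_t nbytes)
{
    return (uint32_t) (nbytes & (cluster_block_size - 1));
}
```

Once the size is no longer a compile-time constant, the compiler cannot
turn the division into a shift on its own, which is why the shift has
to be spelled out.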

The current approach is to replace each affected constant with an
inline switch statement keyed on an enum for the blocksize, with each
case holding the compile-time calculation for that blocksize. In
practice, with -O2 this generates a simple lookup table inline in the
assembly, with the cost of the calculation paid at compile time.
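
A rough sketch of the switch-on-enum pattern follows. The enum, the
macro, and the header/item sizes are illustrative assumptions chosen to
show the shape of the code; they are not the identifiers or values used
in 0004.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical enum naming each supported block size. */
typedef enum BlockSizeId
{
    BLOCK_SIZE_1K,
    BLOCK_SIZE_2K,
    BLOCK_SIZE_4K,
    BLOCK_SIZE_8K,
    BLOCK_SIZE_16K,
    BLOCK_SIZE_32K
} BlockSizeId;

/* Illustrative sizes only; the real derived constants are more involved. */
#define PAGE_HEADER_BYTES 24
#define ITEM_BYTES 28

/* The same formula that used to be evaluated once against a fixed BLCKSZ. */
#define CALC_MAX_ITEMS(blcksz) (((blcksz) - PAGE_HEADER_BYTES) / ITEM_BYTES)

/*
 * Every case expands to a constant expression, so the arithmetic happens at
 * compile time and only the selection among results remains at run time.
 */
static inline uint32_t
max_items_per_page(BlockSizeId id)
{
    switch (id)
    {
        case BLOCK_SIZE_1K:
            return CALC_MAX_ITEMS(1024);
        case BLOCK_SIZE_2K:
            return CALC_MAX_ITEMS(2048);
        case BLOCK_SIZE_4K:
            return CALC_MAX_ITEMS(4096);
        case BLOCK_SIZE_8K:
            return CALC_MAX_ITEMS(8192);
        case BLOCK_SIZE_16K:
            return CALC_MAX_ITEMS(16384);
        case BLOCK_SIZE_32K:
            return CALC_MAX_ITEMS(32768);
    }
    return 0;                   /* unreachable for valid ids */
}
```

Because the case labels are dense and each arm returns a constant, an
optimizing compiler is free to lower the switch to a table indexed by
the enum value.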

The visibility map was the main hot path affected by the move away
from compile-time sizes in the previous version of this patch. With
the modified approach in 0005/0006, this issue has been rectified in
our testing.
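
For readers unfamiliar with the technique, below is a small sketch of
division by a run-time-invariant divisor using a precomputed multiplier
and shifts. It follows the well-known Granlund/Montgomery
multiply-and-shift scheme for 32-bit unsigned values and is only an
illustration of the general idea, not the code in 0005. The struct and
function names are hypothetical. The divisor 32672 used in the usage
note is an assumed example of a non-power-of-two count (heap blocks
covered by one map page at 8kB), where a plain shift cannot be used.

```c
#include <assert.h>
#include <stdint.h>

/* Precomputed state for dividing 32-bit values by a fixed divisor. */
typedef struct FastDiv32
{
    uint32_t divisor;
    uint32_t magic;             /* low 32 bits of the scaled reciprocal */
    uint8_t  sh1;               /* min(l, 1), where l = ceil(log2(divisor)) */
    uint8_t  sh2;               /* max(l - 1, 0) */
} FastDiv32;

/* One-time setup; this is the only place a real division is performed. */
static FastDiv32
fastdiv_init(uint32_t d)
{
    FastDiv32 f;
    uint32_t  l = 0;

    assert(d != 0);
    while ((UINT64_C(1) << l) < d)
        l++;
    f.divisor = d;
    f.magic = (uint32_t) ((((UINT64_C(1) << l) - d) << 32) / d + 1);
    f.sh1 = (uint8_t) (l > 1 ? 1 : l);
    f.sh2 = (uint8_t) (l > 0 ? l - 1 : 0);
    return f;
}

/* n / divisor using one widening multiply, one add, one subtract, two shifts */
static inline uint32_t
fastdiv_u32(const FastDiv32 *f, uint32_t n)
{
    uint32_t t = (uint32_t) (((uint64_t) f->magic * n) >> 32);

    return (t + ((n - t) >> f->sh1)) >> f->sh2;
}

/* n % divisor derived from the quotient */
static inline uint32_t
fastmod_u32(const FastDiv32 *f, uint32_t n)
{
    return n - fastdiv_u32(f, n) * f->divisor;
}
```

Usage would be to call fastdiv_init(32672) once at startup and then use
fastdiv_u32() and fastmod_u32() in the hot path to map a heap block
number to a map page and an offset within it.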

I have tested a few workloads with this modified patch and have seen
positive results compared to v1. I look forward to additional
review/testing/feedback.

Thanks,

David

Attachment Content-Type Size
v2-0001-Add-tool-to-retokenize-for-variable-blocksize.patch application/octet-stream 1.8 KB
v2-0002-Add-wrapper-for-tokenizing-the-whole-repo.patch application/octet-stream 1.2 KB
v2-0005-Add-support-for-fast-non-division-based-div-mod-a.patch application/octet-stream 1.8 KB
v2-0003-Capture-rename-of-symbols.patch application/octet-stream 264.0 KB
v2-0004-Introduce-initdb-selectable-block-sizes.patch application/octet-stream 171.4 KB
v2-0008-Add-pragmas-to-libdivide-to-make-header-check-hap.patch application/octet-stream 730 bytes
v2-0006-Use-fastdiv-code-in-visibility-map.patch application/octet-stream 7.1 KB
v2-0007-Use-libdivide-for-fast-division.patch application/octet-stream 130.8 KB
