Re: Move bki file pre-processing from initdb to bootstrap

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Krishnakumar R <kksrcv001(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Cc: "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Move bki file pre-processing from initdb to bootstrap
Date: 2023-09-19 10:18:37
Message-ID: 8252dea3-d0e7-b127-19c7-7c4a77229930@eisentraut.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 01.09.23 10:01, Krishnakumar R wrote:
> This patch moves the pre-processing for tokens in the bki file from
> initdb to bootstrap. With these changes the bki file will only be
> opened once in bootstrap and parsing will be done by the bootstrap
> parser.

I did some rough performance tests on this. I get about a 10%
improvement on initdb run time, so this appears to have merit.

I wonder whether we can reduce the number of symbols that we need this
treatment for.

For example, for NAMEDATALEN, SIZEOF_POINTER, ALIGNOF_POINTER,
FLOAT8PASSBYVAL, these are known at build time, so we could have
genbki.pl substitute them at build time.

The locale-related symbols (ENCODING, LC_COLLATE, etc.), I wonder
whether we can eliminate the need for them. Right now, these are only
used in the bki entry for the template1 database. How about initdb
creates template0 first, with hardcoded default encoding, collation,
etc., and then creates template1 from that, using the normal CREATE
DATABASE command with the appropriate options. Or initdb could just run
an UPDATE on pg_database to put the right settings in place.

I don't like this part so much, because it adds like 4 more places each
of these variables is mentioned, which increases the mental and testing
overhead for dealing with these features (which are an area of active
development).

In general, it would be good if this could be factored a bit more so
each variable doesn't have to be hardcoded in so many places.

Some more detailed comments on the code:

+ boot_yylval.str = pstrdup(yytext);
+ sprintf(boot_yylval.str, "%d", NAMEDATALEN);

This is weird. You are first assigning the text and then overwriting it
with the numeric value?

You can also use boot_yylval.ival for storing numbers.

+ if (bootp_null(ebootp, ebootp->username)) return
NULLVAL;

Add proper line breaks in the code.

+bool bootp_null(extra_bootstrap_params *e, char *s)

Add a comment what this function is supposed to do.

This function could be static.

+ while ((flag = getopt(argc, argv, "B:c:d:D:Fi:kr:X:-:")) != -1)

You should use an option letter that isn't already in use in one of the
other modes of "postgres". We try to keep those consistent.

New options should be added to the --help output (usage() in main.c).

+ elog(INFO, "Open bki file %s\n", bki_file);
+ boot_yyin = fopen(bki_file, "r");

Why is this needed? It already reads the bki file from stdin?

+ printfPQExpBuffer(&cmd, "\"%s\" --boot -X %d %s %s %s %s -i
%s=%s,%s=%s,%s=%s,"
+ "%s=%s,%s=%s,%s=%s,%s=%s,%s=%c",
+ backend_exec,
+ wal_segment_size_mb * (1024 * 1024),
+ boot_options, extra_options,
+ data_checksums ? "-k" : "",
+ debug ? "-d 5" : "",

This appears to undo some of the changes done in cccdbc5d95.

+#define BOOT_LC_COLLATE "lc_collate"
+#define BOOT_LC_CTYPE "lc_ctype"
+#define BOOT_ICU_LOCALE "icu_locale"

etc. This doesn't look particularly useful. You can just use the
strings directly.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2023-09-19 10:18:49 Re: remaining sql/json patches
Previous Message Pierre Ducroquet 2023-09-19 10:15:55 Re: Improvements in pg_dump/pg_restore toc format and performances