Re: initdb / bootstrap design

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: initdb / bootstrap design
Date: 2022-02-20 21:44:39
Message-ID: 20220220214439.bhc35hhbaub6dush@alap3.anarazel.de

Hi,

On 2022-02-19 20:46:26 -0500, Tom Lane wrote:
> I tried it like that (full patch attached) and the results are intensely
> disappointing. On my Mac laptop, the time needed for 50 iterations of
> initdb drops from 16.8 sec to 16.75 sec.

Hm. I'd hoped for at least a little bit bigger win. But I think it enables
more, see below:

> Not sure that this is worth pursuing any further.

I experimented with moving all the bootstrapping into --boot mode and got it
working. Albeit definitely with a few hacks (more below).

While I had hoped for a bit more of a win, it's IMO a nice improvement.
Executing 10 initdb -N --wal-segsize 1 in a loop:

HEAD:

assert:
8.06user 1.17system 0:09.25elapsed 99%CPU (0avgtext+0avgdata 91724maxresident)k
0inputs+549280outputs (40major+99824minor)pagefaults 0swaps

opt:
2.89user 0.99system 0:04.81elapsed 80%CPU (0avgtext+0avgdata 88864maxresident)k
0inputs+549280outputs (40major+99792minor)pagefaults 0swaps

default to lz4:

assert:
7.61user 1.03system 0:08.69elapsed 99%CPU (0avgtext+0avgdata 91508maxresident)k
0inputs+546400outputs (42major+99551minor)pagefaults 0swaps

opt:
2.55user 0.94system 0:03.49elapsed 99%CPU (0avgtext+0avgdata 88816maxresident)k
0inputs+546400outputs (40major+99551minor)pagefaults 0swaps

bootstrap replace:

assert:
7.42user 1.00system 0:08.52elapsed 98%CPU (0avgtext+0avgdata 91656maxresident)k
0inputs+546400outputs (40major+97737minor)pagefaults 0swaps

opt:
2.49user 0.98system 0:03.49elapsed 99%CPU (0avgtext+0avgdata 88700maxresident)k
0inputs+546400outputs (40major+97728minor)pagefaults 0swaps

everything in bootstrap:

assert:
6.31user 0.94system 0:07.35elapsed 98%CPU (0avgtext+0avgdata 97812maxresident)k
0inputs+547360outputs (30major+88617minor)pagefaults 0swaps

opt:
2.42user 0.85system 0:03.28elapsed 99%CPU (0avgtext+0avgdata 94572maxresident)k
0inputs+547360outputs (30major+83712minor)pagefaults 0swaps

optimize WAL in bootstrap:

assert:
6.26user 0.96system 0:07.29elapsed 99%CPU (0avgtext+0avgdata 97844maxresident)k
0inputs+547360outputs (30major+88586minor)pagefaults 0swaps

opt:
2.43user 0.80system 0:03.24elapsed 99%CPU (0avgtext+0avgdata 94436maxresident)k
0inputs+547360outputs (30major+83664minor)pagefaults 0swaps

remove isatty in bootstrap:

assert:
6.15user 0.83system 0:06.99elapsed 99%CPU (0avgtext+0avgdata 97832maxresident)k
0inputs+465120outputs (30major+88559minor)pagefaults 0swaps

opt:
2.28user 0.85system 0:03.14elapsed 99%CPU (0avgtext+0avgdata 94604maxresident)k
0inputs+465120outputs (30major+83728minor)pagefaults 0swaps

That's IMO not bad.
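To put the numbers above side by side, here is a small sketch that converts the elapsed fields to seconds and computes the reduction relative to HEAD. The elapsed values are copied from the time(1) output above; the dictionary keys are shortened labels and the helper names are made up for illustration.

```python
import re

# Elapsed wall-clock fields copied from the time(1) output above.
# Keys are shortened labels for each step, not names from the patches.
ELAPSED = {
    "HEAD": {"assert": "0:09.25", "opt": "0:04.81"},
    "lz4": {"assert": "0:08.69", "opt": "0:03.49"},
    "bootstrap replace": {"assert": "0:08.52", "opt": "0:03.49"},
    "everything in bootstrap": {"assert": "0:07.35", "opt": "0:03.28"},
    "optimize WAL": {"assert": "0:07.29", "opt": "0:03.24"},
    "isatty": {"assert": "0:06.99", "opt": "0:03.14"},
}


def to_seconds(field):
    """Convert a time(1) 'M:SS.ss' elapsed field to seconds."""
    match = re.fullmatch(r"(\d+):(\d+\.\d+)", field)
    if match is None:
        raise ValueError(f"unexpected elapsed field: {field!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def reduction_vs_head(build):
    """Fractional reduction in elapsed time relative to HEAD, per step."""
    head = to_seconds(ELAPSED["HEAD"][build])
    return {
        step: (head - to_seconds(times[build])) / head
        for step, times in ELAPSED.items()
    }


for build in ("assert", "opt"):
    for step, frac in reduction_vs_head(build).items():
        print(f"{build:6s} {step:25s} {frac:6.1%}")
```

Note that the HEAD opt run shows 80%CPU while all other runs show 98-99%, so the elapsed-time comparison for opt may overstate the gain; comparing user+system time is the more conservative reading.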

On Windows I see higher gains, which makes sense, because filesystem IO is
slower there. On FreeBSD as well, but the variance is oddly high, so I might
be doing something wrong.
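One way to look at run-to-run variance like that is to time each initdb invocation separately instead of timing the whole loop. The following is only a sketch under assumptions: it presumes initdb is on PATH, adds -D (not shown in the command quoted above) because initdb needs a target directory, and all function names are hypothetical.

```python
import shutil
import statistics
import subprocess
import sys
import tempfile
import time
from pathlib import Path

# Harmless stand-in command for dry runs of the harness itself.
NOOP_COMMAND = [sys.executable, "-c", "pass"]


def initdb_command(datadir):
    """Build the initdb invocation; -N and --wal-segsize 1 are from the mail."""
    return ["initdb", "-D", str(datadir), "-N", "--wal-segsize", "1"]


def time_runs(make_command, runs=10):
    """Run the command `runs` times, each in a fresh temp dir; return seconds."""
    samples = []
    for _ in range(runs):
        workdir = Path(tempfile.mkdtemp(prefix="initdb-bench-"))
        try:
            cmd = make_command(workdir / "data")
            start = time.perf_counter()
            subprocess.run(
                cmd,
                check=True,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            samples.append(time.perf_counter() - start)
        finally:
            # Remove the cluster so later runs are not affected by disk usage.
            shutil.rmtree(workdir, ignore_errors=True)
    return samples


def summarize(samples):
    """Basic spread statistics for a list of per-run timings."""
    return {
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
        "min": min(samples),
        "max": max(samples),
    }
```

Calling `summarize(time_runs(initdb_command))` would give per-run mean and standard deviation; unlike the time(1) figures above, these are per invocation, not totals for ten runs.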

The main reason I like this, however, isn't the speedup itself, but that with
this change initdb doesn't depend on single user mode at all anymore.

About the prototype:

- Most of the bootstrap SQL is executed from bootstrap.c itself. But some
still comes from the client. E.g. password, a few information_schema
details and the database / authid changes.

- To execute the sql I mostly used extension.c's
read_whole_file()/execute_sql_string(). But VACUUM, CREATE DATABASE require
all the transactional hacks in portal.c etc. So I wrapped
exec_simple_query() for that phase.

Might be better to just call vacuum.c / database.c directly.

- for indexed relcache access to work the phase of
RelationCacheInitializePhase3() that's initially skipped needs to be
executed. I hacked that up by adding a RelationCacheInitializePhase3b() that
bootstrap.c can call, but that's obviously too ugly to live.

- InvalidateSystemCaches() is needed after bki processing. Otherwise I see a
"row is too big:" error. Didn't investigate yet.

- I definitely removed some validation that we'd probably want. But that seems
like something to care about later...

- 0004 prevents a fair bit of WAL from being written. While XLogInsert did
some of that, it didn't block FPIs, which obviously are bulky. This reduces
WAL from ~5MB to ~100kB.

There's quite a bit of further speedup potential:

- One bottleneck, particularly in optimized mode, is the handling of huge node
trees for views. strToNode() and nodeRead() are > 10% alone

- Enabling index access sometime during the postgres.bki processing would make
invalidation handling for subsequent indexes faster. Or maybe we can disable
a few more invalidations. Inval processing is >10%

- more than 10% (assert) / 7% (optimized) is spent in
compute_scalar_stats()->qsort_arg(). Something seems off with that to me.

Completely crazy?

Greetings,

Andres Freund

Attachment Content-Type Size
v1-0001-Set-default_toast_compression-lz4-if-available.patch text/x-diff 2.3 KB
v1-0002-initdb-move-token-replacing-in-postgres.bki-to-ba.patch text/x-diff 8.1 KB
v1-0003-initdb-perform-everything-during-boot-mostly-in-b.patch text/x-diff 43.6 KB
v1-0004-initdb-Optimize-WAL-writing-during-initdb.patch text/x-diff 5.3 KB
v1-0005-initdb-call-isatty-only-once-in-bootparse.y.patch text/x-diff 1.0 KB
