Re: Cutting initdb's runtime (Perl question embedded)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Andreas Karlsson <andreas(at)proxel(dot)se>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Cutting initdb's runtime (Perl question embedded)
Date: 2017-04-13 18:05:43
Message-ID: 9244.1492106743@sss.pgh.pa.us
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2017-04-13 12:56:14 -0400, Tom Lane wrote:
>> Andres Freund <andres(at)anarazel(dot)de> writes:
>>> Cool. I wonder if we also should remove AtEOXact_CatCache()'s
>>> cross-checks - the resowner replacement has been in place for a while,
>>> and seems robust enough. They're now the biggest user of time.

>> Hm, biggest user of time in what workload? I've not noticed that
>> function particularly.

> Just initdb. I presume it's because the catcaches will frequently be
> relatively big there.

Hm. That ties into something I was looking at yesterday. The only
reason that function is called so much is that bootstrap mode runs a
separate transaction for *each line of the bki file* (cf do_start,
do_end in bootparse.y). Which seems pretty silly. I experimented
with collapsing all the transactions for consecutive DATA lines into
one transaction, but couldn't immediately make it work due to memory
management issues. I didn't try collapsing the entire run into a
single transaction, but maybe that would actually be easier, though
no doubt more wasteful of memory.
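
A toy model of the cost argument above, not PostgreSQL code: every name
here (ToyCache, toy_run, and so on) is hypothetical, and it assumes the
end-of-transaction cross-check costs one unit per cached entry, which is
a simplification. It only illustrates why one transaction per bki line
multiplies that work, and how batching lines would reduce it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the catalog caches during bootstrap. */
typedef struct ToyCache
{
    size_t entries;          /* entries currently cached */
    size_t entries_scanned;  /* total work done by end-of-xact checks */
    size_t xact_count;       /* number of transactions committed */
} ToyCache;

/* Processing one DATA line is assumed to add one cache entry. */
static void
toy_insert_line(ToyCache *c)
{
    c->entries++;
}

/* Assumed cost model: the cross-check visits every cached entry. */
static void
toy_end_xact(ToyCache *c)
{
    c->entries_scanned += c->entries;
    c->xact_count++;
}

/*
 * Process nlines lines, committing after every batch_size lines.
 * batch_size == 1 models one transaction per line; batch_size == nlines
 * models collapsing the whole run into a single transaction.
 */
static ToyCache
toy_run(size_t nlines, size_t batch_size)
{
    ToyCache c = {0, 0, 0};
    size_t   in_batch = 0;

    assert(batch_size > 0);

    for (size_t i = 0; i < nlines; i++)
    {
        toy_insert_line(&c);
        if (++in_batch == batch_size)
        {
            toy_end_xact(&c);
            in_batch = 0;
        }
    }
    if (in_batch > 0)
        toy_end_xact(&c);   /* commit the trailing partial batch */

    return c;
}
```

Under this model the per-line case does work quadratic in the number of
lines, while a single transaction scans the cache once. It says nothing
about the memory management issues mentioned above.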

>> I agree that it doesn't seem like we need to spend a lot of time
>> cross-checking there, though. Maybe keep the code but #ifdef it
>> under some nondefault debugging symbol.

> Hm, if we want to keep it, maybe tie it to CLOBBER_CACHE_ALWAYS or such,
> so it gets compiled at least sometimes? Not a great fit, but ...

Don't like that, because CCA is by definition not the normal cache
behavior. It would make a bit of sense to tie it to CACHEDEBUG,
but as you say, it'd never get tested normally if we do that.
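
For illustration only, a minimal sketch of what keeping the cross-check
under a nondefault debugging symbol could look like. TOY_CACHE_XCHECK,
ToyEntry and toy_at_eoxact are hypothetical names invented for this
sketch; it is not the actual AtEOXact_CatCache() code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache entry; only the reference count matters here. */
typedef struct ToyEntry
{
    int refcount;
} ToyEntry;

/*
 * End-of-transaction hook.  Returns how many entries were examined, so
 * a caller can see whether the cross-check was compiled in.
 */
static size_t
toy_at_eoxact(const ToyEntry *entries, size_t n)
{
#ifdef TOY_CACHE_XCHECK
    /* Debug build: verify no entry is still pinned. */
    for (size_t i = 0; i < n; i++)
        assert(entries[i].refcount == 0);
    return n;
#else
    /* Default build: the check costs nothing. */
    (void) entries;
    (void) n;
    return 0;
#endif
}
```

Compiling with -DTOY_CACHE_XCHECK enables the loop. The drawback is the
one already noted: code behind a symbol nobody defines is never
exercised in normal testing.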

On the whole, though, we may be looking at diminishing returns here.
I just did some "perf" measurement of the overall "initdb" cycle,
and what I'm seeing suggests that bootstrap mode as such is now a
pretty small fraction of the overall cycle:

+ 51.07% 0.01% 28 postgres postgres [.] PostgresMain #
...
+ 13.52% 0.00% 0 postgres postgres [.] AuxiliaryProcessMain #

That says that the post-bootstrap steps are now the bulk of the time,
which agrees with naked-eye observation.

regards, tom lane
