
Cutting initdb's runtime (Perl question embedded)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Cutting initdb's runtime (Perl question embedded)
Date: 2017-04-12 14:12:47
Message-ID: 30896.1492006367@sss.pgh.pa.us
Lists: pgsql-hackers

Andres mentioned, and I've confirmed locally, that a large chunk of
initdb's runtime goes into regprocin's brute-force lookups of function
OIDs from function names.  The recent discussion about cutting TAP test
time prompted me to look into that question again.  We had had some
grand plans for getting genbki.pl to perform the name-to-OID conversion
as part of a big rewrite, but since that project is showing few signs
of life, I'm thinking that a more localized performance fix would be
a good thing to look into.  There seem to be a couple of plausible
routes to a fix:

1. The best thing would still be to make genbki.pl do the conversion,
and write numeric OIDs into postgres.bki.  The core stumbling block
here seems to be that for most catalogs, Catalog.pm and genbki.pl
never really break down a DATA line into fields --- and we certainly
have got to do that, if we're going to replace the values of regproc
fields.  The places that do need to do that approximate it like this:

	# To construct fmgroids.h and fmgrtab.c, we need to inspect some
	# of the individual data fields.  Just splitting on whitespace
	# won't work, because some quoted fields might contain internal
	# whitespace.  We handle this by folding them all to a simple
	# "xxx". Fortunately, this script doesn't need to look at any
	# fields that might need quoting, so this simple hack is
	# sufficient.
	$row->{bki_values} =~ s/"[^"]*"/"xxx"/g;
	@{$row}{@attnames} = split /\s+/, $row->{bki_values};

We would need a bullet-proof, non-hack, preferably not too slow way to
split DATA lines into fields properly.  I'm one of the world's worst
Perl programmers, but surely there's a way?
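
One possible shape for such a splitter, as a minimal sketch rather than a proposed patch: walk the string with a \G-anchored regex instead of calling split. The function name split_bki_values and the sample column list are hypothetical. The sketch assumes a field is either a bare run of non-whitespace characters or a double-quoted string with no embedded double quotes, which is the same shape the "xxx" hack above already relies on. Quoted fields are returned with their quotes, so a row can be re-joined after a regproc field has been replaced.

```perl
use strict;
use warnings;

# Split the value part of a DATA line into fields.
# Assumption: each field is either a bare token or a double-quoted
# string without embedded double quotes.  Quotes are kept in the result.
sub split_bki_values
{
	my ($values) = @_;
	my @fields;

	# \G resumes where the previous match ended; /c keeps that position
	# when the match finally fails, so leftovers can be checked below.
	while ($values =~ /\G\s*("[^"]*"|[^\s"]+)(?=\s|\z)/gc)
	{
		push @fields, $1;
	}

	# Only trailing whitespace may remain; anything else is malformed.
	die "could not parse DATA values: $values\n"
	  unless $values =~ /\G\s*\z/gc;

	return @fields;
}

# Usage with a hypothetical, shortened column list and payload.
my @attnames = qw(proname prolang prosrc);
my %row;
@row{@attnames} = split_bki_values('boolin 12 "two words"');
```

Unlike the fold-to-"xxx" approach, this keeps the real contents of quoted fields and dies on input it cannot account for, such as an unterminated quote.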

2. Alternatively, we could teach bootstrap mode to build a hashtable
mapping proname to OID while it reads pg_proc data from postgres.bki,
and then make regprocin's bootstrap path consult the hashtable instead
of looking directly at the pg_proc file.  That I'm quite sure is do-able,
but it seems like it's leaving money on the table compared to doing
the transformation earlier.
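
To make the lookup idea concrete, here is a rough model of the name-to-OID table, written in Perl only for brevity. In route 2 the real thing would be C code in the bootstrap path, which is not shown; in route 1 genbki.pl could build the same kind of table. The function names and the row layout are hypothetical. The sketch assumes that names appearing more than once must be rejected rather than resolved, since regprocin reports an error for ambiguous names, and it lets numeric strings and "-" through as regprocin does.

```perl
use strict;
use warnings;

# Build a proname => OID hash from rows given as [oid, proname] pairs.
# Names that occur more than once are dropped, so that lookups of
# overloaded functions fail instead of silently picking one.
sub build_proname_map
{
	my (@rows) = @_;
	my (%oid_of, %ambiguous);

	foreach my $row (@rows)
	{
		my ($oid, $proname) = @$row;
		if (exists $oid_of{$proname})
		{
			$ambiguous{$proname} = 1;
		}
		else
		{
			$oid_of{$proname} = $oid;
		}
	}
	delete @oid_of{ keys %ambiguous };
	return \%oid_of;
}

# Resolve one regproc field value to a numeric OID.
sub lookup_regproc
{
	my ($map, $name) = @_;

	return 0     if $name eq '-';          # "-" means no function
	return $name if $name =~ /^\d+$/;      # already a numeric OID
	die "no unique function named \"$name\"\n"
	  unless exists $map->{$name};
	return $map->{$name};
}
```

The table is built in one pass over the pg_proc rows, and each later lookup is a single hash probe rather than a scan of all rows.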

Thoughts?

			regards, tom lane

