Re: Bootstrap DATA is a pita

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Bootstrap DATA is a pita
Date: 2015-02-21 16:43:09
Message-ID: 20150221164309.GA2037@awork2.anarazel.de
Lists: pgsql-hackers

On 2015-02-21 11:34:09 -0500, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On 2015-02-20 22:19:54 -0500, Peter Eisentraut wrote:
> >> On 2/20/15 8:46 PM, Josh Berkus wrote:
> >>> Or what about just doing CSV?
>
> >> I don't think that would actually address the problems. It would just
> >> be the same format as now with different delimiters.
>
> > Yea, we need hierarchies and named keys.
>
> Yeah. One thought though is that I don't think we need the "data" layer
> in your proposal; that is, I'd flatten the representation to something
> more like
>
> {
>     oid => 2249,
>     oiddefine => 'CSTRINGOID',
>     typname => 'cstring',
>     typlen => -2,
>     typbyval => 1,
>     ...
> }

I don't really like that - then stuff like oid, description, comment (?)
have to not conflict with any catalog columns. I think it's easier to
have them separate.

> This will be easier to edit, either manually or programmatically I think.
> The code that turns it into a .bki file will need to know the exact set
> of columns in each system catalog, but it would have had to know that
> anyway I believe, if you're expecting it to insert default values.

There'll need to be some awareness of columns, sure. But I think
programmatically editing the values will still be simpler if you don't
need to discern whether a key is a column or some genbki-specific value.
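
A minimal sketch of the difference between the two layouts, written in Python purely for illustration (the actual tooling is Perl). The names `METADATA_KEYS`, `columns_nested` and `columns_flat` are hypothetical, and the sample values are copied from the quoted example above:

```python
# Hypothetical illustration; not actual genbki code.

# Nested form: tool-level metadata sits beside a "data" hash of catalog columns.
nested_entry = {
    "oid": 2249,
    "oiddefine": "CSTRINGOID",
    "data": {"typname": "cstring", "typlen": -2, "typbyval": 1},
}

# Flat form: metadata and catalog columns share one namespace.
flat_entry = {
    "oid": 2249,
    "oiddefine": "CSTRINGOID",
    "typname": "cstring",
    "typlen": -2,
    "typbyval": 1,
}

# Assumed set of tool-specific keys. With the flat form every consumer needs
# this list, and no catalog column may ever reuse one of these names.
METADATA_KEYS = {"oid", "oiddefine", "description"}


def columns_nested(entry):
    """Column values of a nested entry; no knowledge of key names is needed."""
    return dict(entry["data"])


def columns_flat(entry, metadata_keys=METADATA_KEYS):
    """Column values of a flat entry; the metadata keys must be filtered out."""
    return {k: v for k, v in entry.items() if k not in metadata_keys}
```

Both functions return the same column set here, but only the flat variant depends on a reserved-key list that has to be kept in sync with every catalog.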

> Ideally the column defaults could come from BKI_ macros in the catalog/*.h
> files; it would be good if we could keep those files as the One Source of
> Truth for catalog schema info, even as we split out the initial data.

Hm, yea.
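
To make the defaults idea concrete, here is a rough sketch of how a generator could fill in omitted columns once it knows each catalog's column list. It is in Python for illustration only. `PG_TYPE_COLUMNS` and `fill_defaults` are hypothetical names, and the column list and default values are made up rather than taken from the real catalog headers:

```python
# Hypothetical column metadata, standing in for whatever a tool would extract
# from a catalog header. Each entry is (column name, default); a default of
# None means the value must be given explicitly in the data file.
PG_TYPE_COLUMNS = [
    ("typname", None),
    ("typlen", None),
    ("typbyval", "f"),
    ("typdelim", ","),
]


def fill_defaults(columns, values):
    """Return the row as an ordered list, applying defaults where omitted."""
    known = {name for name, _ in columns}
    unknown = set(values) - known
    if unknown:
        raise KeyError(f"not a column of this catalog: {sorted(unknown)}")

    row = []
    for name, default in columns:
        if name in values:
            row.append(values[name])
        elif default is not None:
            row.append(default)
        else:
            raise ValueError(f"no value and no default for column {name}")
    return row


row = fill_defaults(PG_TYPE_COLUMNS, {"typname": "cstring", "typlen": -2})
```

Keeping the column list and the defaults in one place means the data file only has to state what differs from the default.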

One thing I was considering was to do the regtype and regproc lookups
directly in the tool. That'd have two advantages: 1) it'd make it
possible to refer to typenames in pg_proc, 2) it'd be much faster. Right
now most of initdb's time is doing syscache lookups during bootstrap,
because it can't use indexes... A simple hash lookup during bki
generation could lead to quite measurable savings during initdb.
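
Roughly, the lookup could work like the following sketch (Python for illustration; the real generator is Perl). `build_oid_index`, `resolve` and the sample rows are hypothetical. Rejecting duplicate names is an assumption about how ambiguous references, such as overloaded function names, should be handled:

```python
def build_oid_index(rows, name_key):
    """Build a name -> oid hash once, so each later lookup is a dict access."""
    index = {}
    for row in rows:
        name = row[name_key]
        if name in index:
            # Assumption: an ambiguous name is an error, not a silent pick.
            raise ValueError(f"ambiguous name: {name}")
        index[name] = row["oid"]
    return index


def resolve(index, name):
    """Translate a symbolic name into its oid at generation time."""
    try:
        return index[name]
    except KeyError:
        raise ValueError(f"unknown name: {name}") from None


# Sample pg_type rows (int4 = 23, text = 25).
pg_type_rows = [
    {"oid": 23, "typname": "int4"},
    {"oid": 25, "typname": "text"},
]
type_oids = build_oid_index(pg_type_rows, "typname")

# A pg_proc entry that refers to its return type by name instead of by oid.
proc_entry = {"proname": "int4in", "prorettype": "int4"}
resolved = dict(proc_entry, prorettype=resolve(type_oids, proc_entry["prorettype"]))
```

The emitted .bki data would then contain plain oids, so bootstrap mode would not need to do a name lookup for these columns at all.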

We could then even rip the bootstrap code out of regtypein/regprocin...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
