Re: WIP: a way forward on bootstrap data

From: John Naylor <jcnaylor(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Fetter <david(at)fetter(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: a way forward on bootstrap data
Date: 2018-01-13 10:43:09
Message-ID: CAJVSVGXKsiwMVbtx-nGqPeFzsCEWmFs5wFmepEawdzAyWhLO-Q@mail.gmail.com
Lists: pgsql-hackers

On 1/13/18, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:

> I'm afraid a key value system would invite writing the attributes in
> random order and create a mess over time.

A developer can certainly write them in random order, and it will
still work. However, in patch 0002 I have a script to enforce a
standard appearance. Of course, for it to work, you have to run it. I
describe it, if rather tersely, in the README changes in patch 0008.
Since several people have raised this concern, I will go into a bit
more depth here. Perhaps I should reuse some of this language for the
README to improve it.

src/include/catalog/rewrite_dat.pl knows where to find the schema of
each catalog, namely the pg_*.h header, accessed via ParseHeader() in
Catalog.pm. It writes key/value pairs in the order found in the
schema:

{ key_1 => 'value_1', key_2 => 'value_2', ..., key_n => 'value_n' }

The script also has an array of four hard-coded metadata fields: oid,
oid_symbol, descr, and shdescr. If any of these are present, they will
go on their own line first, in the order given:

{ oid => 9999, oid_symbol => 'FOO_OID', descr => 'comment on foo',
key_1 => 'value_1', key_2 => 'value_2', ..., key_n => 'value_n' }
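
To make the ordering rule concrete, here is a minimal sketch, written in Python rather than the Perl of the actual script. The function name format_entry and the sample keys are hypothetical; only the four metadata field names come from the description above. It assumes integers are written bare and everything else is single-quoted, and it ignores escaping.

```python
# Hypothetical illustration only; rewrite_dat.pl itself is Perl.
METADATA_FIELDS = ["oid", "oid_symbol", "descr", "shdescr"]


def format_entry(entry, schema_columns):
    """Render one catalog entry: metadata fields first, then schema order."""

    def pair(key):
        value = entry[key]
        # Assumption: integers are written bare, everything else is quoted.
        rendered = str(value) if isinstance(value, int) else "'%s'" % value
        return "%s => %s" % (key, rendered)

    # The input order of the keys is irrelevant; only these two lists matter.
    meta = [pair(k) for k in METADATA_FIELDS if k in entry]
    data = [pair(k) for k in schema_columns if k in entry]

    lines = []
    if meta:
        # Metadata goes on its own line, with a trailing comma if data follows.
        lines.append(", ".join(meta) + ("," if data else ""))
    if data:
        lines.append(", ".join(data))
    return "{ " + "\n".join(lines) + " }"


# Keys deliberately written in "random" order.
entry = {"key_2": "value_2", "descr": "comment on foo",
         "key_1": "value_1", "oid": 9999}
print(format_entry(entry, ["key_1", "key_2"]))
```

However the developer orders the keys in the source, the output order is determined by the metadata list and the schema, which is what keeps the files from drifting into a mess.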

> I don't think I like this. I know pg_proc.h is a pain to manage, but at
> least right now it's approachable programmatically. I recently proposed
> a patch to replace the columns proisagg and proiswindow with a combined
> column prokind. I could easily write a small Perl script to make that
> change in pg_proc.h, because the format is easy to parse and has one
> line per entry. With this new format, that approach would no longer
> work, and I don't know what would replace it.

I've attached four diffs/patches to walk through how you would replace
the columns proisagg and proiswindow with a combined column prokind.

Patch 01: Add new prokind column to pg_proc.h, with a default of 'n'.
In many cases, this is all you would have to do, as far as
bootstrapping is concerned.

Diff 02: This is a one-off script diffed against rewrite_dat.pl. In
rewrite_dat.pl, I have a section with this comment, and this is where
I put the one-off code:

# Note: This is also a convenient place to do one-off
# bulk-editing.

(I haven't documented this with explicit examples, so I'll have to remedy that.)

You would run it like this:

cd src/include/catalog
perl -I ../../backend/catalog/ rewrite_dat_with_prokind.pl pg_proc.dat

While reading pg_proc.dat, the default value for prokind is added
automatically. We inspect proisagg and proiswindow, and change prokind
accordingly. pg_proc.dat now has all three columns, prokind, proisagg,
and proiswindow.
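
The logic of that one-off step can be sketched roughly as follows, in Python for illustration (the real code is a few lines of Perl inside a copy of rewrite_dat.pl; see the attached diff 02). The function name populate_prokind is hypothetical. The default 'n' is from patch 01; the 't'/'f' boolean spelling and the codes 'a' and 'w' for aggregates and window functions are assumptions made for this example.

```python
# Hypothetical illustration only; not the code in the attached diff.
PROKIND_DEFAULT = "n"  # default given in patch 01


def populate_prokind(entry):
    """Return a copy of entry with prokind derived from the old columns."""
    result = dict(entry)
    # Step 1: the default is filled in when the column is absent.
    result.setdefault("prokind", PROKIND_DEFAULT)
    # Step 2: override the default based on the old boolean columns.
    # Assumptions: booleans are stored as 't'/'f', and 'a'/'w' are the
    # codes chosen for aggregates and window functions.
    if result.get("proisagg") == "t":
        result["prokind"] = "a"
    elif result.get("proiswindow") == "t":
        result["prokind"] = "w"
    return result


plain = populate_prokind({"proname": "foo", "proisagg": "f", "proiswindow": "f"})
agg = populate_prokind({"proname": "bar", "proisagg": "t", "proiswindow": "f"})
print(plain["prokind"], agg["prokind"])
```

Note that the old columns are left in place at this stage, matching the intermediate state described above where all three columns are present.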

Patch 03: Remove old columns from pg_proc.h

Now we run the standard rewrite:

perl -I ../../backend/catalog/ rewrite_dat.pl pg_proc.dat

Any values not found in the schema will simply not be written to
pg_proc.dat, so the old columns are now gone.
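
As a rough illustration of why the old columns disappear, here is a Python sketch of the filtering idea (the actual script is Perl, and keep_schema_columns is a hypothetical name). It assumes the writer only emits keys that are either metadata fields or columns present in the current schema.

```python
# Hypothetical illustration only; rewrite_dat.pl itself is Perl.
METADATA_FIELDS = ("oid", "oid_symbol", "descr", "shdescr")


def keep_schema_columns(entry, schema_columns):
    """Keep only metadata fields and columns still present in the schema."""
    allowed = set(schema_columns) | set(METADATA_FIELDS)
    return {key: value for key, value in entry.items() if key in allowed}


# Schema after patch 03: proisagg and proiswindow are no longer columns.
# (Only two columns are listed here to keep the example short.)
new_schema = ["proname", "prokind"]
old_entry = {"oid": 9999, "proname": "foo", "prokind": "n",
             "proisagg": "f", "proiswindow": "f"}
print(keep_schema_columns(old_entry, new_schema))
```

No explicit "delete this column" step is needed: removing the column from the header is enough, and the next rewrite drops the stale values.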

The result is found in patch 04.
--

Note: You could theoretically also load the source data into tables,
do the updates with SQL, and dump back out again. I made some progress
with this method, but it's not complete. I think the load and dump
steps add too much complexity for most use cases, but it's a
possibility.

-John Naylor

Attachment Content-Type Size
02_prokind_example_populate_new_column.diff text/plain 642 bytes
01_prokind_example_add_new_column.patch text/x-patch 492 bytes
03_prokind_example_remove_old_columns.patch text/x-patch 556 bytes
04_prokind_data_end_result.patch text/x-patch 52.7 KB
