Re: Bootstrap DATA is a pita

From: Mark Dilger <hornschnorter(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Caleb Welton <cwelton(at)pivotal(dot)io>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Bootstrap DATA is a pita
Date: 2015-12-11 22:24:40
Message-ID: 5A73C1D1-6129-481D-AEAD-50EF8FB73572@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


> On Dec 11, 2015, at 1:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>> Crazy idea: we could just have a CSV file which can be loaded into a
>> table for mass changes using regular DDL commands, then dumped back from
>> there into the file. We already know how to do these things, using
>> \copy etc. Since CSV uses one line per entry, there would be no merge
>> problems either (or rather: all merge problems would become conflicts,
>> which is what we want.)
>
> That's an interesting proposal. It would mean that the catalog files
> stay at more or less their current semantic level (direct representations
> of bootstrap catalog contents), but it does sound like a more attractive
> way to perform complex edits than writing Emacs macros ;-).

I would be happy to work on this, if there is much chance of the community
accepting a patch. Do you think replacing the numeric Oids for functions,
operators, opclasses and such in the source files with their names would
be ok, with the SQL converting those to Oids in the output? My eyes have
gotten tired more than once trying to read head files in src/include/catalog
looking for mistakes in what largely amounts to a big table of numbers.

For example, in pg_amop.h:

/* default operators int2 */
DATA(insert ( 1976 21 21 1 s 95 403 0 ));
DATA(insert ( 1976 21 21 2 s 522 403 0 ));
DATA(insert ( 1976 21 21 3 s 94 403 0 ));
DATA(insert ( 1976 21 21 4 s 524 403 0 ));
DATA(insert ( 1976 21 21 5 s 520 403 0 ));

Would become something like:

amopfamily amoplefttype amoprighttype amopstrategy amoppurpose amopopr amopmethod amopsortfamily
integer_ops int2 int2 1 search "<" btree 0
integer_ops int2 int2 2 search "<=" btree 0
integer_ops int2 int2 3 search "=" btree 0
integer_ops int2 int2 4 search ">=" btree 0
integer_ops int2 int2 5 search ">" btree 0

Note that I prefer to use tabs and a headerline, as the tabstop can be set to
line them up nicely, and the headerline allows you to see which column is
which, and what it is for. Csv is always harder for me to use that way, though
maybe that is just a matter of which editor i use. (vim)

And yes, I'd need to allow the HEADER option for copying tab delimited
files, since it is currently only allowed for csv, I believe.

mark

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Corey Huinker 2015-12-11 22:26:41 Re: Should TIDs be typbyval = FLOAT8PASSBYVAL to speed up CREATE INDEX CONCURRENTLY?
Previous Message Alvaro Herrera 2015-12-11 21:47:57 Re: REASSIGN OWNED doesn't know how to deal with USER MAPPINGs