Re: WIP: a way forward on bootstrap data

From: John Naylor <jcnaylor(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, David Fetter <david(at)fetter(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: a way forward on bootstrap data
Date: 2018-01-14 17:55:26
Message-ID: CAJVSVGXURbQyfhKfj3AUrPOM+MOrp-4iMUVjqF2w_mmX_PV2VQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 1/14/18, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I'm not sure about the decision to move all the OID macros into
> one file; that seems like namespace pollution.
<snip>
> The design I'd kind of imagined was one generated file of #define's
> per pg_*.h file, not just one giant one.

First, thanks for the comments. I will start incorporating them in a
few days to give others the chance to offer theirs.

I'm inclined to agree about namespace pollution. One stumbling block
is the makefile changes to allow OID symbols to be visible to
non-backend code. Assuming I took the correct approach for the single
file case, adapting that to multiple files would require some
rethinking.

> It's especially
> odd that you did that but did not consolidate fmgroids.h with
> the macros for symbols from other catalogs.

Point taken.

> It would be really nice, also, if the attribute number macros
> (Natts_pg_proc, Anum_pg_proc_proname, etc) could be autogenerated.
> Manually renumbering those is one of the bigger pains in the rear
> when adding catalog columns. It was less of a pain than adjusting
> the DATA lines of course, so I never figured it was worth doing
> something about in isolation --- but with this infrastructure in
> place, that's manual work we shouldn't have to do anymore.

Searching the archives for discussion of Anum_* constants [1], I
prefer your one-time suggestion to use enums instead. I'd do that, and
then have Catalog.pm throw an error if Natts_* doesn't match the
number of columns. That's outside the scope of this patch, however.

> Another thing that I'd sort of hoped might happen from this patchset
> is to cure the problem of keeping some catalog headers safe for
> client-side inclusion, because some clients want the OID value macros
> and/or macros for column values (eg PROVOLATILE_IMMUTABLE), so they
> currently have to #include those headers or else hard-code the values.
> We've worked around that to date with ad-hoc solutions like splitting
> function declarations out to pg_*_fn.h files, but I never liked that
> much. With the OID value macros being moved out to separate generated
> file(s), there's now a possibility that we could fix this once and for all
> by making client-side code include those file(s) not pg_type.h et al
> themselves. But we'd need a way to put the column-value macros into
> those files too; maybe that's too messy to make it practical.

To make sure I understand you correctly:
Currently we have

pg_type.h oid symbols, column symbols
pg_type_fn.h function decls

And assuming we go with one generated oid symbol file per header, my
patch would end up with something like

pg_type.h column symbols (#includes pg_type_sym.h)
pg_type_fn.h function decls
pg_type_sym.h oid symbols (generated)

And you're saying you'd prefer

pg_type.h function decls (#includes pg_type_sym.h)
pg_type_sym.h oid symbols, column symbols (generated)

I agree that it'd be messy to drive the generation of the column
symbols. I'll think about it. What about

pg_type.h function decls (#includes pg_type_sym.h)
pg_type_sym.h column symbols (static, #includes pg_type_oids.h)
pg_type_oids.h oid symbols (generated)

It's complicated, but arguably no more so than finding someplace more
distant to stick the column symbols and writing code to copy them.
It'd be about than 20 *_sym.h headers and 10 *_oids.h headers.

> The .dat files need to have header comments that follow project
> conventions, in particular they need to contain copyright statements.
> Likewise for generated files.

Okay.

> I've got zero faith that the .h files will hold still long enough
> for these patches to be reviewed and applied. The ones that touch
> significant amounts of data need to be explained as "run this script
> on the current data", rather than presented as static diffs.

I've already rebased over a catalog change and it was not much fun, so
I'd be happy to do it this way.

> I'm not really thrilled by the single-purpose "magic" behaviors added
> in 0007, such as computing prosrc from proname. I think those will
> add more confusion than they're worth.

Okay. I still think generating pg_type OID symbols is worth doing, but
I no longer think this is a good place to do it.

> In 0010, you relabel the types of some OID columns so that genbki.pl
> will know which lookup to apply to them. That's not such a problem for
> the relabelings that are just macros and genbki.pl converts back to
> type OID in the .bki file. But you also did things like s/Oid/regtype/,
> and that IS a problem because it will affect what client code sees in
> those catalog columns. We've discussed changing those columns to
> regfoo types in the past, and decided not to, because of the likelihood
> of breaking client queries. I do not think this patch gets to change
> that policy. So the way to identify the lookup rule needs to be
> independent of whether the column is declared as Oid or an Oid alias type.
> Perhaps an explicit marker telling what transformation to make, like
>
> Oid rngsubtype BKI_LOOKUP(pg_type);
>
> would work for that.

Okay. I fail to see how client queries are affected, since I change
everything back to oid, but I think your design is cleaner anyway.

> I'm not really on board at all with 0012, which AFAICS moves the indexing
> and toast-table information out of indexing.h and toasting.h for no good
> reason whatever. We'll have quite enough code thrash and pending-patch
> breakage from this patch set; we don't need to take on rearrangements that
> aren't buying anything.

I don't have a convincing rebuttal, so I'll withdraw it.
--

[1] https://www.postgresql.org/message-id/25254.1248533810%40sss.pgh.pa.us

-John Naylor

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ildus Kurbangaliev 2018-01-14 21:49:30 Re: [HACKERS] Custom compression methods
Previous Message Tom Lane 2018-01-14 16:44:30 Re: WIP: a way forward on bootstrap data