Re: [PATCH] Teach Catalog.pm how many attributes there should be per DATA() line

From: David Christensen <david(at)endpoint(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Teach Catalog.pm how many attributes there should be per DATA() line
Date: 2015-10-09 20:27:18
Message-ID: 96ADB358-39AC-4546-B281-F5B2A18A2D78@endpoint.com
Lists: pgsql-hackers


> On Oct 9, 2015, at 2:17 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Thu, Oct 8, 2015 at 12:43 PM, David Christensen <david(at)endpoint(dot)com> wrote:
>> I’m happy to move it around, but if everything is in order, how will this affect things at all? If we’re in a good state this condition should never trigger.
>
> Right, but I think it ought to be Catalog.pm's job to parse the config
> file. The job of complaining about what it contains is a job worth
> doing, but it's not the same job. Personally, I hate it when parsers
> take it upon themselves to do semantic analysis, because then what
> happens if you want to write, say, a tool to repair a broken file?
> You need to be able to read it in without erroring out so that you can
> frob whatever's busted and write it back out, and the parser is
> getting in your way. Maybe that's not going to come up here, but I'm
> just explaining my general philosophy on these things…

I don’t disagree in general, but this is a very specific use case, and moving the check would lose the niceness of being able to tie an error back to the specific line number in the file in question. The alternative is to track that information in a separate structure which we then pass around, and that seems like overkill.

The only two consumers of the catalog-specific data lines (at least via direct access in Perl) are genbki.pl and Gen_fmgrtab.pl. We would need to add these checks at both call sites anyway, so it seems important to me to bail out early if we see any issues. I think failing as soon as we notice the problem, with as much context as possible to fix it (i.e., as written), is the right choice. We can certainly pretty up the messages.
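To make the "bail early with context" idea concrete, here is a minimal sketch in Python rather than the actual Perl in Catalog.pm. The regex, the function names, and the simplified DATA() line format are all assumptions for illustration; the real parser handles more cases.

```python
import re

# Hypothetical, simplified matcher for lines such as
#   DATA(insert OID = 1242 ( boolin 11 10 ));
# The real catalog headers are parsed by Perl code with more elaborate rules.
DATA_RE = re.compile(
    r'^DATA\(insert(?:\s+OID\s*=\s*\d+)?\s*\(\s*(.*?)\s*\)\s*\);\s*$'
)


class CatalogParseError(Exception):
    """Raised when a DATA() line has the wrong number of attributes."""


def split_values(body):
    """Split a DATA() body on whitespace, keeping double-quoted strings whole."""
    return re.findall(r'"[^"]*"|\S+', body)


def parse_data_lines(lines, expected_attrs, filename="<catalog header>"):
    """Return the parsed rows, failing at the first malformed DATA() line."""
    rows = []
    for lineno, line in enumerate(lines, start=1):
        m = DATA_RE.match(line)
        if not m:
            continue  # not a DATA() line; ignored in this sketch
        values = split_values(m.group(1))
        if len(values) != expected_attrs:
            # The file name and line number are still in hand here, which is
            # the context that would otherwise have to be carried around.
            raise CatalogParseError(
                f"{filename}:{lineno}: expected {expected_attrs} attributes, "
                f"found {len(values)}"
            )
        rows.append(values)
    return rows
```

Because the check runs inside the loop that reads the file, the error message can name the offending line without any extra bookkeeping structure.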

The consistency of the system catalogs in the development state is fundamental to whether there is anything sensible to query at all. By definition, a data row with missing columns is a mistake, and whatever data was parsed from it is worse than useless: we cannot know which column is missing, so the values can and will end up misaligned. Thus I don’t believe that we’d want other (hypothetical) Catalog.pm consumers to try to use data that we know is bad.
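As a small illustration of the misalignment problem, again in Python and with made-up values (only the column names are taken from pg_proc), pairing column names with a row that is one value short silently shifts everything after the gap:

```python
columns = ["proname", "pronamespace", "proowner", "prolang"]

# Hypothetical row in which the pronamespace value was accidentally left out.
short_row = ["boolin", "10", "12"]

# zip() stops at the shorter input, so nothing complains about the gap.
mapped = dict(zip(columns, short_row))

# The value meant for proowner is now labelled pronamespace, the value meant
# for prolang is labelled proowner, and prolang is missing entirely.
```

A consumer looking only at `mapped` has no way to tell which column was actually dropped.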
--
David Christensen
PostgreSQL Team Manager
End Point Corporation
david(at)endpoint(dot)com
785-727-1171
