Re: Portal->commandTag as an enum

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Portal->commandTag as an enum
Date: 2020-02-19 02:40:34
Message-ID: 6288426F-DBFE-4AF1-8737-27354018C7FA@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Feb 11, 2020, at 1:05 PM, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> wrote:
>
>> On Feb 11, 2020, at 1:02 PM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>>
>> On 2020-Feb-11, Mark Dilger wrote:
>>
>>>> No thanks.
>>>
>>> I’m not sure which option you are voting for:
>>>
>>> (Option #1) Have the perl script generate the .c and .h file from a .dat file
>>>
>>> (Option #2) Have the perl script validate but not generate the .c and .h files
>>>
>>> (Option #3) Have no perl script, with all burden on the programmer to get the .c and .h files right by hand.
>>>
>>> I think you’re voting against #3, and I’m guessing you’re voting for #1, but I’m not certain.
>>
>> I was voting against #2 (burden the programmer with consistency checks
>> that must be fixed by hand, without actually doing the programmatically-
>> doable work), but I don't like #3 either. I do like #1.
>
> Option #1 works for me. If I don’t see any contrary votes before I get back to this patch, I’ll implement it that way for the next version.

While investigating how best to implement option #1, I took a look at how Catalog.pm does it.

Catalog.pm reads data files and eval()s chunks of them to vivify perl data.

# We're treating the input line as a piece of Perl, so we
# need to use string eval here. Tell perlcritic we know what
# we're doing.
eval '$hash_ref = ' . $_; ## no critic (ProhibitStringyEval)

This would only make sense to me if the string held in $_ had already been checked for safety, but Catalog.pm does very little to verify that the string is safe to eval. The assumption here, so far as I can infer, is that we don’t embed anything dangerous in our .dat files, so this should be ok. That may be true for the moment, but I can imagine a day when we start embedding perl functions as quoted text inside a data file, or shell commands as text which look enough like perl for eval() to be able to execute them. Developers who edit these .dat files and mess up the quoting, and then rerun ‘make’ to get the new .c and .h files generated, may not like the side effects. Perhaps I’m being overly paranoid….

Rather than add more code generation logic based on the design of Catalog.pm, I wrote a perl based data file parser that parses .dat files and returns vivified perl data, as Catalog.pm does, but with much stricter parsing logic to make certain nothing dangerous gets eval()ed. I put the new module in DataFile.pm. The commandtag data has been consolidated into a single .dat file. A new perl script, gencommandtag.pl, parses the commandtag.dat file and generates the .c and .h files. So far, only gencommandtag.pl uses DataFile.pm, but I’ve checked that it can parse all the .dat files currently in the source tree.

The new parser is more flexible about the structure of the data, which seems good to me for making it easier to add or modify data files in the future. The new parser does not yet have a means of hacking up the data to add autogenerated fields and such that Catalog.pm does, but I think a more clean break between parsing and autovivifying fields would be good anyway. If I get generally favorable reviews of DataFile.pm, I might go refactor Catalog.pm. For now, I’m just leaving Catalog.pm alone.

> That script can then generate all
> the needed .c and .h files, which are not going to be part of the source
> tree, and will always be in-sync and won't have the formatting
> strictness about it. And you won't have the Martian syntax you had to
> use in the commandtag.c file.

I still have the “Martian syntax”, though now it is generated by the perl script. I can get rid of it, but I think Andres liked the Martian stuff.

> We don't adhere to that 100%, and some people dislike it, but I'd rather
> be consistent and not be passing "FooData *" around; it's just noisier.

I renamed QueryCompletionData to just QueryCompletion, and I’m passing pointers to that.

Attachment Content-Type Size
v4-0001-Migrating-commandTag-from-string-to-enum.patch application/octet-stream 146.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2020-02-19 02:52:23 Re: pg_trigger.tgparentid
Previous Message Amit Langote 2020-02-19 02:22:07 Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side