Re: WIP: Generic functions for Node types using generated metadata

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: Generic functions for Node types using generated metadata
Date: 2019-09-20 22:43:54
Message-ID: 20190920224354.jihya5waks642e6s@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2019-09-19 22:18:57 -0700, Andres Freund wrote:
> While working on this I evolved the node string format a bit:
>
> 1) Node types start with their "normal" name, rather than
> uppercase. There seems little point in having such a divergence.
>
> 2) The node type is followed by the node-type id. That allows locating
> the corresponding node metadata more quickly (an array lookup and one
> name recheck, rather than a binary search). I.e. the node starts with
> "{Scan 18 " rather than "{SCAN " as before.
>
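
For illustration, the id-based lookup could look roughly like this (the
struct and array names here are my invention, not the actual generated
metadata):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical metadata entry; the real generated structures differ. */
typedef struct NodeTypeInfo
{
	const char *name;			/* node type name, e.g. "Scan" */
} NodeTypeInfo;

/* Hypothetical metadata array, indexed by node-type id. */
static const NodeTypeInfo node_type_info[] = {
	[18] = {"Scan"},
	[35] = {"Join"},
	[37] = {"MergeJoin"},
};

/*
 * With the id emitted right after the name ("{Scan 18 ..."), reading a
 * node is a direct array index plus one strcmp() to recheck the name,
 * instead of a binary search over all node-type names.
 */
static const NodeTypeInfo *
lookup_node_type(const char *name, int id)
{
	if (id < 0 ||
		id >= (int) (sizeof(node_type_info) / sizeof(node_type_info[0])))
		return NULL;
	if (node_type_info[id].name == NULL ||
		strcmp(node_type_info[id].name, name) != 0)
		return NULL;			/* id and name disagree: treat as error */
	return &node_type_info[id];
}
```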
> 3) Nodes that contain other nodes as sub-types "inline", still emit {}
> for the subtype. There's no functional need for this, but I found the
> output otherwise much harder to read. E.g. for mergejoin we'd have
> something like
>
> {MergeJoin 37 :join {Join 35 :plan {Plan ...} :jointype JOIN_INNER ...} :skip_mark_restore true ...}
>
> 4) As seen in the above example, enums are decoded to their string
> values. I found that makes the output easier to read. Again, not
> functionally required.
>
> 5) Value nodes are now always emitted with a {Value ...} wrapper. I
> changed this when I expanded the WRITE/READ tests, and encountered
> failures because the old encoding is not entirely roundtrip safe
> (e.g. -INT32_MIN will be parsed as a float at raw parse time, but
> after write/read, it'll be parsed as an integer). While that could be
> fixed in other ways (e.g. by emitting a trailing . for all floats), I
> also found it to be clearer this way - Value nodes are otherwise
> indistinguishable from raw strings, raw numbers, etc., which is not
> great.
>
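
To make the roundtrip hazard concrete, here's a minimal sketch
(fits_int32 is an illustrative helper, not PG code). As I understand
the example: the raw lexer tokenizes the digits "2147483648" before the
unary minus is applied, which overflows int32 and so yields a Float,
while the written form "-2147483648" does fit and reads back as an
Integer:

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Illustrative check: would this token fit in an int32?  The raw parser
 * sees "2147483648" (no sign), which doesn't fit -> Float; the written
 * form "-2147483648" does fit -> Integer.  The node type changes across
 * the write/read roundtrip.
 */
static bool
fits_int32(const char *token)
{
	char	   *end;
	long long	val;

	errno = 0;
	val = strtoll(token, &end, 10);
	return errno == 0 && *end == '\0' && val >= INT_MIN && val <= INT_MAX;
}
```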
> It'd also be easier to now just change the node format to something else.

E.g. to just use json. Which'd certainly be a lot easier to delve into,
given the amount of tooling (both on the pg SQL level, and for
commandline / editors / etc). I don't think it'd be any less
efficient. There'd be a few more = signs, but the lexer is smarter /
faster than the one currently in use for the outfuncs format. And we'd
just reuse pg_parse_json rather than having a dedicated parser.
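
For a rough idea of the shape, the MergeJoin example from above might
render in JSON along these lines (the key spelling and nesting here are
guesswork, purely to show what the format could look like):

```json
{"MergeJoin": {"join": {"Join": {"plan": {"Plan": {}},
                                 "jointype": "JOIN_INNER"}},
               "skip_mark_restore": true}}
```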

- Andres
