Re: [PATCH] Generic type subscripting

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Generic type subscripting
Date: 2017-03-14 23:10:13
Message-ID: 20423.1489533013@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dmitry Dolgov <9erthalion6(at)gmail(dot)com> writes:
> [ generic_type_subscription_v7.patch ]

I looked through this a bit.

I think that the basic design of having a type-specific parse analysis
function that returns a constructed SubscriptingRef node is fine.

I'm not totally excited about the naming you've chosen though,
particularly the function names "array_subscripting()" and
"jsonb_subscripting()" --- those are too generic, and a person coming to
them for the first time would probably expect that they actually execute
subscripting, when they do no such thing. Names like
"array_subscript_parse()" might be better. Likewise the name of the
new pg_type column doesn't really convey what it does, though I suppose
"typsubscriptparse" is too much of a mouthful. "typsubparse" seems short
enough but might be confusing too.

I wonder also if we should try to provide some helper functions rather
than expecting every data type to know all there is to know about parsing
and execution of subscripting. Not sure what would be helpful, however.

One thing that needs more thought for sure is the nested-assignment case
(the logic around isAssignmentIndirectionExpr) --- the design you've got
here implies that *every* container-like datatype would need to duplicate
that logic, and I don't think we want to go there.

The documentation needs a lot of work of course, and I do not think
you're doing either yourself or anybody else any favors with the proposed
additions to src/tutorial/. You'll never be sure that that stuff even
compiles let alone accurately represents what people need to do. Actual
running code is much better. It may be that jsonb_subscript is enough
of an extension example, but perhaps adding a subscripting feature to some
contrib module would be better.

Aren't SBS_VALIDATION and SBS_EXEC just hangovers from the previous
design? They're still in parse_node.h, and they're still mentioned in
the docs, but I don't see them used in actual code anywhere.
get_slice_arguments seems to be a hangover as well, which is good
because it's mighty ugly and undocumented.

It seems rather silly for ExecEvalSubscriptingRef to be palloc'ing some
per-subscript arrays each time through when it's got other arrays that are
still of fixed size MAXDIM. I can believe that it might be a good idea
to increase or remove the MAXDIM limit, but this doesn't do it. In any
case, you don't want to add the overhead of a couple of pallocs per
execution. Using OidFunctionCall2 is even worse: that's adding a system
catalog lookup per execution. You need to be caching the function address
as is done for regular function and operator calls. (I take it you've not
done performance testing yet.)

I'm not really finding this to be an improvement:
- errmsg("array subscript in assignment must not be null")));
+ errmsg("container subscript in assignment must not be null")));
"Container" is going to seem like jargon to users. Maybe it'd be okay to
drop the word altogether and just say "subscript in assignment must not be
null". (Another question here is whether every datatype will be on board
with the current rules about null subscripts, or whether we need to
delegate null-handling to the datatype-specific execution function.
I'm not sure; it would complicate the API significantly for what might be
useless flexibility.)

I'm tempted to propose that it'd be a good idea to separate the regular
(varlena) array code paths from the fixed-length-array code paths
altogether, which you could do in this implementation by having separate
execution functions for them. That would buy back some fraction of
whatever overhead we're adding with the additional level of function call.
Maybe even separate execution functions for the slice and not-slice
cases, though that might be overkill.

I'm not on board with these APPLY_OPERATOR_TO_TYPE macros. If you
think you have a cute idea for improving the notation in the node support
files, great; submit a patch that changes all of the support functions
at once. Patches that introduce one or two support functions that look
radically different from all the rest are not acceptable.

Likewise, what you did in places like JumbleExpr is too cute by half.
Just make two separate cases for the two new node types. You're not
saving enough code to justify the additional intellectual complexity
and maintenance burden of doing it like that.

I do not think that the extra SubscriptingBase data structure is paying
for itself either; you're taking a hit in readability from the extra level
of struct, and as far as I can find it's not buying you one single thing,
because there's no code that operates on a SubscriptingBase argument.
I'd just drop that idea and make two independent struct types, or else
stick with the original ArrayRef design that made one struct serve both
purposes. (IOW, my feeling that a separate assign struct would be a
readability improvement isn't exactly getting borne out by what you've
done here. But maybe there's a better way to do that.)

I wouldn't suggest putting a lot of work into the execQual.c part of the
patch right now, as execQual.c is going to look completely different if
Andres' patch gets in. Better to concentrate on cleaning up the parsenode
struct types and thinking about a less messy answer for nested assignment.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2017-03-14 23:10:19 pgsql: Improve isolation tests infrastructure.
Previous Message Peter Geoghegan 2017-03-14 22:51:40 Re: GUC for cleanup indexes threshold.