Re: [PATCH] Generic type subscripting

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: David Steele <david(at)pgmasters(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: peter(dot)eisentraut(at)2ndquadrant(dot)com, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Generic type subscripting
Date: 2017-03-21 20:42:06
Message-ID: CA+q6zcUX=Om75kgZXkpYPhnwUCo4osFpzd83dKPzjSGfCmxEgA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 21 March 2017 at 18:16, David Steele <david(at)pgmasters(dot)net> wrote:
>
> This thread has been idle for over a week.

Yes, sorry for the late reply. I'm still trying to find a better solution
for
some of the problems, that arose in this patch.

> On 15 March 2017 at 00:10, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Dmitry Dolgov <9erthalion6(at)gmail(dot)com> writes:
>
> I looked through this a bit.
>

Thank you for the feedback.

> > On 10 March 2017 at 06:20, Peter Eisentraut <
peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
> > I see a possible problem here: This design only allows one subscripting
> > function. But what you'd really want in this case is at least two: one
> > taking an integer type for selecting by array index, and one taking text
> > for selecting by field name.
>
> I would guess that what we really want for jsonb is the ability to
> intermix integer and text subscripts, so that
> jsonbcol['foo'][42]['bar']
> would extract the "bar" field of an object in position 42 of an
> array in field "foo" of the given jsonb value.
>

Maybe I misunderstood you, but isn't it already possible?

```
=# select ('{"a": [{"b": 1}, {"c": 2}]}'::jsonb)['a'][0]['b'];
jsonb
-------
1
(1 row)

=# select * from test;
data
-----------------------------
{"a": [{"b": 1}, {"c": 2}]}
(1 row)

=# update test set data['a'][0]['b'] = 42;
UPDATE 1

=# select * from test;
data
-----------------------------
{"a": [{"b": 42}, {"c": 2}]}
(1 row)
```

> I'm not totally excited about the naming you've chosen though,
> particularly the function names "array_subscripting()" and
> "jsonb_subscripting()" --- those are too generic, and a person coming to
> them for the first time would probably expect that they actually execute
> subscripting, when they do no such thing. Names like
> "array_subscript_parse()" might be better. Likewise the name of the
> new pg_type column doesn't really convey what it does, though I suppose
> "typsubscriptparse" is too much of a mouthful. "typsubparse" seems short
> enough but might be confusing too.

It looks quite tempting for me to replace the word "subscript" by an
abbreviation all over the patch. Then it will become something like
"typsbsparse" which is not that mouthful, but I'm not sure if it will be
easily
recognizable.

> I wonder also if we should try to provide some helper functions rather
> than expecting every data type to know all there is to know about parsing
> and execution of subscripting. Not sure what would be helpful, however.

I don't really see what details we can hide behind this helper, because
almost
all code there is type specific (e.g. to check if subscript indexes are
correct), can you elaborate on that?

> The documentation needs a lot of work of course, and I do not think
> you're doing either yourself or anybody else any favors with the proposed
> additions to src/tutorial/.

Yes, unfortunately, I forget to update documentation from the previous
version
of the patch. I'll fix it soon in the next version.

> Aren't SBS_VALIDATION and SBS_EXEC just hangovers from the previous
> design? They're still in parse_node.h, and they're still mentioned in
> the docs, but I don't see them used in actual code anywhere.

Yes, these are from the previous version too, I'll remove them.

> Another question here is whether every datatype will be on board
> with the current rules about null subscripts, or whether we need to
> delegate null-handling to the datatype-specific execution function.
> I'm not sure; it would complicate the API significantly for what might be
> useless flexibility.

It looks for me that it's a great idea to perform null-handling inside
datatype
specific code and I'm not sure, what would be complicated? All necessary
information for that is already in `SubscriptingExecData` (I just have to
use
`eisnull` in a different way).

> I do not think that the extra SubscriptingBase data structure is paying
> for itself either; you're taking a hit in readability from the extra level
> of struct, and as far as I can find it's not buying you one single thing,
> because there's no code that operates on a SubscriptingBase argument.
> I'd just drop that idea and make two independent struct types, or else
> stick with the original ArrayRef design that made one struct serve both
> purposes. (IOW, my feeling that a separate assign struct would be a
> readability improvement isn't exactly getting borne out by what you've
> done here. But maybe there's a better way to do that.)

I'm thinking to replace these structures by more meaningful ones, something
like:

```
typedef struct SubscriptContainerRef
{
Expr xpr;
Oid refcontainertype;
Oid refelemtype;
int32 reftypmod;
Oid refcollid;
} SubscriptBase;

typedef struct SubscriptExtractRef
{
Expr xpr;
SubscriptContainerRef *containerExpr;
List *refupperindexpr;
List *reflowerindexpr;
Oid refevalfunc;
} SubscriptExtractRef;

typedef struct SubscriptAssignRef
{
Expr xpr;
SubscriptContainerRef *containerExpr;
Expr *refassgnexpr;
List *refupperindexpr;
List *reflowerindexpr;
Oid refevalfunc;
} SubscriptAssignRef;

```

It's close to the previous version, but here will be a bit less duplicated
code
in comparison with two independent structures, we still can use different
nodes
for data assignment and data extraction, and node processing for these nodes
can be little bit more separated, e.g.:

```
case T_SubscriptExtractRef:
{
// extract specific logic
sbsref->containerExpr = ExecInitExpr((Expr *)
sbsref->containerExpr, parent);
}
case T_SubscriptAssignRef:
{
// assign specific logic
sbsref->containerExpr = ExecInitExpr((Expr *)
sbsref->containerExpr, parent);
}
case T_SubscriptContainerRef:
{
// subscript container logic
}
```

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Denish Patel 2017-03-21 20:43:30 Re: Monitoring roles patch
Previous Message Vitaly Burovoy 2017-03-21 20:11:07 Re: identity columns