Re: array support patch phase 1 patch

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: array support patch phase 1 patch
Date: 2003-04-09 19:00:20
Message-ID: 9274.1049914820@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> Here's an updated patch that fixes the above mentioned issues. I also
> added significantly more functionality.

I've applied much but not all of this. Some comments:

I didn't apply any of the aggregate changes, because I'm unsure of how
we actually want to do that. AFAICT you've set it up so that the
polymorphism of the transition function is hidden within the aggregate,
and one must still create a separate aggregate for each input datatype.
Couldn't we fix things so that a polymorphic transition function leads
to a polymorphic aggregate? I'm not totally sure that makes sense, but
it seems worth thinking about.

Also I didn't put in the bool_op stuff. That seemed pretty messy; in
particular I didn't care for looking at the operator names to decide
what to do. Another problem is that the actual lookup of the scalar
operators would be schema search path dependent. I'd feel more
comfortable with something that created a tighter binding of the array
operators to the underlying scalar operators. Not sure how to do it,
though.

There are a number of remaining annoyances in array type assignment and
coercion. Here's one:

create table t1 (p point[]);
insert into t1 values(array['(3,4)','(4,5)','(6,7)']);
ERROR: column "p" is of type point[] but expression is of type text[]
You will need to rewrite or cast the expression
Instead you have to write
insert into t1 values(array['(3,4)'::point,'(4,5)','(6,7)']);

I'm not sure if there's any way around this, but it certainly reduces the
usefulness of untyped literals. Right offhand it seems like maybe we could
invent an UNKNOWNARRAY type so as to postpone the decision about what the
array's type should be --- but I'm worried about possible side-effects.
I think there are places where the existence of "unknown[]" might allow
the type resolution logic to follow paths we don't want it to follow.

I think there are a lot of places where binary-compatible element types
and domain element types will not be treated reasonably. We need to think
about where and how to handle that.

Another problem is:

regression=# select distinct array(select * from text_tbl) from foo;
ERROR: Unable to identify an equality operator for type text[]

This DISTINCT query would fail anyway of course, for lack of a '<'
operator for arrays, but it seems like we ought to be able to find the
polymorphic '=' operator.

We really ought to reimplement array_eq to be less bogus: instead of bit
equality it ought to be applying the element type's equality operator.

I rearranged the way that parse_coerce and function/operator overloading
resolution handle arrays. It seems cleaner to me, but take a look and
see what you think.

One thing I didn't like about the implementation of many of these functions
is that they were willing to repeat catalog lookups on every call. I fixed
array_type_coerce to cache lookup results, please see if you can apply the
technique elsewhere.

The selectivity estimation functions you'd assigned to the bool_ops operators
seemed like pretty bogus choices. I am not sure we can use scalar estimators
for these at all --- perhaps a new set of estimators needs to be written.

> create table foo(f1 integer ARRAY);
> create table foo(f1 integer ARRAY[]);
> create table foo(f1 integer ARRAY[x]);
> create table foo(f1 integer ARRAY[][]); (and more [] etc)
> create table foo(f1 integer ARRAY[x][y]); (and more [] etc)

AFAICT, SQL99 specifies the syntax "typename ARRAY [ intconst ]" and
nothing else. I do not see a reason to accept ARRAY without [],
ARRAY with empty [], ARRAY with multiple [], etc if the spec doesn't
make us do so. The existing Postgres syntax without the word ARRAY
gets the job done just fine.

regards, tom lane

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Joe Conway 2003-04-09 22:22:10 Re: array support patch phase 1 patch
Previous Message Peter Jones 2003-04-09 18:15:46 Modern C++ Interface