Re: proposal: array utility functions phase 1

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: array utility functions phase 1
Date: 2002-12-09 18:13:49
Message-ID: 15414.1039457629@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> Yes, this is exactly what I was yearning to do. Was there a spec or technical
> reason (or both) for not allowing the following?

> select * from array_values(g.grolist), pg_group g where g.groname = 'g2';

This seems fairly unworkable to me as-is. By definition, WHERE selects
from a cross-product of the FROM tables; to make the above do what you
want, you'd have to break that fundamental semantics. The semantics of
explicit JOIN cases would be broken too.

What we need is some kind of explicit multi-level SELECT operation.
Perhaps it would help to think about the analogy of aggregates of
aggregate functions, which are impossible to express properly in a
single SELECT but work nicely given subselect-in-FROM.
Subselect-in-FROM doesn't seem to get this job done though.

Right offhand I don't see any reasonable syntax other than
function-in-the-SELECT-list, which shoots us right back into the
messinesses of the Berkeley implementation. However, we do now have the
precedent of the table-function AS clause. Does it help any to do
something like

SELECT grosysid, array_values(grolist) AS (array_index,member_id)
FROM pg_group where groname = 'g2';

(Again you could wrap this in an outer SELECT to transform the
member_ids to member_names.)

The real problem with the Berkeley approach shows up when you consider
what happens with multiple table functions called in a single SELECT.
The code we currently have produces the cross-product of the implied
rows (or at least it tries to, I seem to recall that it doesn't
necessarily get it right). That's pretty unpleasant, and though you can
filter the rows in an outer SELECT, there's no way to optimize the
implementation into a smarter-than-nested-loop join.

It seems like somehow we need a level of FROM/WHERE producing some base
rows, and then a set of table function calls to apply to each of the
base rows, and then another level of WHERE to filter the results of the
function calls (in particular to provide join conditions to identify
which rows to match up in the function outputs). I don't see any way to
do this without inventing new SELECT clauses out of whole cloth
... unless SQL99's WITH clause helps, but I don't think it does ...

> How ugly/difficult would it be to allow the planner to interrogate the
> function and let the function report back a tupledesc based on the actual
> runtime input parameters?

Parse-time, not run-time. It could be done --- IIRC, the auxiliary
"function info" call we introduced in the V1 fmgr protocol was
deliberately designed to allow expansion in this sort of direction.
But it would have to take a tupledesc (or some similar static
description) and return another one.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Lamar Owen 2002-12-09 18:34:44 Re: [GENERAL] PostgreSQL Global Development Group
Previous Message Bruce Momjian 2002-12-09 18:08:29 Re: IPv6 patch rejected

Browse pgsql-patches by date

  From Date Subject
Next Message Peter Eisentraut 2002-12-09 19:48:55 Re: show casts and conversions in psql (2nd try)
Previous Message Bruce Momjian 2002-12-09 18:11:29 Re: vacuumlo patch