Schemas vs. PostQUEL: resolving qualified identifiers

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Schemas vs. PostQUEL: resolving qualified identifiers
Date: 2002-01-19 00:29:43
Message-ID: 11093.1011400183@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I'm starting to think about how to do SQL-compatible schemas in Postgres.

One of the first issues that comes up is that Postgres has interpretations
of qualified (dotted) names that conflict with what the standard says to
do. This stuff is a hangover from Berkeley days and PostQUEL. Briefly,
since there are no schemas in PG, anytime we have a dotted name we know
that the first component *must* be a table name. Successive components are
then resolved as either column or function names applied to whatever we
have so far. For example,

a.b.c is equivalent to c(a.b) if c is a function

The equivalence works the other way too: you can write colname(tabname)
and it will be taken as tabname.colname (though this seems to work only
if tabname already has a rangetable entry).

This is not going to fly together with SQL92, which wants a.b.c to mean
schema a, table b, column c.

I believe we can resolve the conflict without breaking any cases that
are likely to be used much in practice. Here's my proposal:

First, I'd like to allow the notation "table.*" to be used as a function
argument, representing passing a whole-row value to a function (the
function's argument would be declared as the table rowtype). Presently
this is done by writing just the undecorated table name, but that is
ambiguous if the same name is also used as a column name in the query.
Also, we need this notation for use with qualified table names.

Having done that, we can resolve names appearing in expressions thus:

foo First try to resolve as unqualified column name;
if not found, try to resolve as unqualified table name
(in which case this is a whole-row value, equivalent
to foo.*).

foo.bar foo is an unqualified table name. Try first to resolve
bar as a column name of foo; if not found, try to resolve
bar as a function taking the rowtype of foo.

foo.bar.baz This refers to table bar in schema foo. baz is either
a column or function name, as above.

foo.bar.baz.quux Refers to table baz in schema bar in catalog foo.
quux is either a column or function name, as above.

foo.* foo is an unqualified table name; means whole-row value.

foo.bar.* Whole row of table bar in schema foo.

foo.bar.baz.* Whole row of table baz in schema bar in catalog foo.

The first two of these rules are the same as current behavior; the next
two are additions mandated by SQL92; the last three are a proposed
extension to allow unambiguous reference to whole-row values.

With these rules, we do not lose any functionality that we have now,
but some PostQUEL-ish constructs will need to be rewritten into an
equivalent form.

PostQUEL-isms that still work:

table.func equivalent to func(table.*)

func(table) equivalent to func(table.*), as long as table name
doesn't match any column name of the query

PostQUEL-isms that no longer work:

table.col.func Must rewrite as func(table.col)

table.func1.func2 Must rewrite as, eg, func2(func1(table.*))

While there are a couple of examples of the above two constructs in the
regression tests, I doubt they are used much in the real world.

Another PostQUEL-ism that I would like to remove is col(tab) to mean
tab.col; I'd prefer to restrict the function syntax to functions only.
We could continue to support this for unqualified table names, but
given the above rules it could not work for a qualified table name,
eg, col(schema.tab) would be misinterpreted. Again, it seems unlikely
that many people are using this, so we may as well try to regularize
the syntax as much as we can.

You may wonder why I'm still allowing the construction tab.func ---
why not get rid of that too? Well, mainly because I'd like to support
the Oracle-ish syntax for nextval: seqname.nextval. Together with the
desire not to break existing code unnecessarily, that seems a good enough
reason to leave it in.

Comments?

regards, tom lane

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2002-01-19 00:57:05 Re: pg_dump question
Previous Message Laurette Cisneros 2002-01-19 00:26:29 pg_dump question