Simplifying OID lookups in the presence of namespaces

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Simplifying OID lookups in the presence of namespaces
Date: 2002-04-22 20:23:57
Message-ID: 26213.1019507037@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

We are about to need to fix a fair number of places in client code
(eg, psql and pg_dump) that presently do things like

SELECT * FROM pg_attribute
WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = 'foo');

This does not work reliably anymore because there could be multiple
relations named 'foo' in different namespaces. The sub-select to
get the relation OID will fail because it'll return multiple results.

The brute-force answer is

SELECT * FROM pg_attribute
WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = 'foo'
AND relnamespace = (SELECT oid FROM pg_namespace
WHERE nspname = 'bar'));

But aside from being really ugly, this requires that the client code
know exactly which namespace contains the relation it's after. If
the client is relying on namespace search then it may not know that;
in fact, the client code very possibly isn't even aware of the exact
namespace search path it's using. I am planning to introduce an
informational function CURRENT_SCHEMAS() (or some such name) that
returns the current effective search path, probably as a NAME[] array.
But it looks really, really messy to write an SQL query that makes
use of such a function to look up the first occurrence of 'foo' in
the search path. We need to encapsulate the lookup procedure somehow
so that we don't have lots of clients reinventing this wheel.

We already have some functions that accept a text string and do a
suitable lookup of a relation; an example is nextval(), for which
you can presently write

nextval('foo') --- searches namespace path for foo
nextval('foo.bar') --- looks only in namespace foo
nextval('"Foo".bar') --- quoting works for mixed-case names

Seems like what we want to do is make the lookup part of this available
separately, as a function that takes such a string and returns an OID.
We'd need such functions for each of the namespace-ified object kinds:
relations, datatypes, functions, and operators.

A variant of the idea of inventing functions is to extend the existing
datatype 'regproc' to do this, and invent also 'regclass', 'regtype',
'regoperator' datatypes to do the lookups for the other object kinds.
I proposed this in a different context last year,
http://archives.postgresql.org/pgsql-hackers/2001-08/msg00589.php
but it seemed too late to do anything with the idea for 7.2.

If we went with the datatype approach then we'd be able to write
queries like

SELECT * FROM pg_attribute WHERE attrelid = 'foo'::regclass;

or

SELECT * FROM pg_attribute WHERE attrelid = 'foo.bar'::regclass;

or for that matter you could do

SELECT * FROM pg_attribute WHERE attrelid = regclass('foo');

which'd be syntactically indistinguishable from using a function.

The datatype approach seems a little bit odder at first glance, but it
has some interesting possibilities with respect to implicit casting
(see above-referenced thread). So I'm inclined to go that route unless
someone's got an objection.

With a datatype, we also have outbound conversion to think of: so there
must be a function that takes an OID and produces a string. What I am
inclined to do on that side is emit an unqualified name if the OID
refers to a relation/type/etc that would be found first in the current
namespace search path. Otherwise, a qualified name (foo.bar) would be
emitted. This will have usefulness for applications like pg_dump, which
will have exactly this requirement (per discussion a few days ago that
pg_dump should not qualify names unnecessarily).

One question is what to do with invalid input. For example, if table
foo doesn't exist then what should 'foo'::regclass do? The existing
regproc datatype throws an error, but I wonder whether it wouldn't be
more useful to return NULL. Any thoughts on that?

Also, for functions and operators the name alone is not sufficient to
uniquely identify the object. Type regproc currently throws an error
if asked to convert a nonunique function name; that severely limits its
usefulness. I'm toying with allowing datatypes in the input string,
eg
'sum(bigint)'::regproc
but I wonder if this will create compatibility problems. In particular,
should the regproc and regoperator output converters include datatype
indicators in the output string? (Always, never, only if not unique?)
Doing so would be a non-backwards-compatible change for regproc.
We might avoid that complaint by leaving regproc as-is and instead
inventing a parallel datatype (say regfunction) that supports datatype
indications. But I'm not sure whether regproc is used enough to make
this an important concern.

Comments?

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dann Corbit 2002-04-22 21:15:37 What is wrong with hashed index usage?
Previous Message Patrick Macdonald 2002-04-22 17:57:31 Re: Documentation on page files