Re: search_path improvements

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Greg Stark <stark(at)enterprisedb(dot)com>
Cc: "David E(dot) Wheeler" <david(at)kineticode(dot)com>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Stephen Frost <sfrost(at)snowman(dot)net>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Subject: Re: search_path improvements
Date: 2009-05-31 20:12:36
Message-ID: 4A22E4B4.9060304@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greg,

> What's the point of "namespaces" if not to implement visibility? The
> interesting thing to do would be to hide all the internal foo
> functions in a foo.* schema and only put the external api in public.

That is an interesting idea. However, what our real users are really
doing in the field is more diverse. (see below)

> That way you can't accidentally call an internal foo function or have
> a name conflict between two internal functions. The external api could
> even just be a bunch of thin wrappers around the implementation
> functions in foo.* (what Oracle calls public synonyms).

This assumes that all users should have access to the same public APIs
as all other users. Real applications are more complex.

In my experience of PostgreSQL applications, people use schema for three
different reasons:

Organization/Maintainability: when you have hundreds or thousands of
database objects, you want "folders" to put them in just so that you can
keep track of them and view them in easy-to-digest groups, just as you
deal with files in a filesystem. DBAs no more want to put everything in
one big flat namespace, even if the individual names are unique and the
permissions are the same, than we want to have all of the PostgreSQL
source code in one big directory.
Further, these schema names generally indicate something about the
purpose of the objects in them: "cms","matviews", "reports","calendar".
When accurate, these schema names aid the DBA in maintaining and
troubleshooting the application, and are more convenient than hungarian
notation schemes.

Visibility: some applications use schema to hide objects from roles
which shouldn't see them: "inner","cronjobs","acl", whether for data
hiding or just to keep "private" functions and tables separate from what
the application accesses directly. However, this approach is not very
common *because of* the awkwardness and overhead of search_path; DBAs
are constantly troubleshooting search_path omissions and errors and
eventually give up on visibility rules, making all schema visible to all
users.
This gets even more difficult when you consider that in a large complex
application with multiple ROLEs, not all ROLEs should see all schema,
but what an individual user can access might be a list of schema which
represent some-but-not-all schema. The lack of a convenient
"search_path_add" or "SET ROLE ... WITH DEFAULTS" makes this an
unmanageable mess; DBAs find themselves constantly ALTERing each user's
search_path individually.

Security: schema provide convenient containers to lock up groups of
objects by role. "admin", "permissions" etc. schemas combine visibility
and USE restrictions to make sql injection much harder, and
administrative tasks are supported by objects in schema not accessible
to the "webuser".

As I said before, schema conbine 3 purposes: organization, visibility
and security, into one structure. Which is why it's difficult to make
them work perfectly for all 3 purposes. We could, however, make them
work better.

>> c) the ability as superuser to have my own "special schema" which are always
>> in the search path, as pg_catalog and $user_temp are.*
>
>> * if you're not sure why someone would want this, consider
>> information_schema. If your application depends on I_S to work, how do you
>> make sure it's always in every user's search_path?
>
> Uhm, wouldn't you just refer to information_schema.foo? What if some
> other part of your application depends on information_schema *not*
> being in your path? Using global state for this seems destined to
> leave you with something broken that can't be fixed without breaking
> something else.

Easily said for someone who doesn't have to adapt a 3rd-party vendor
application or support real users on the phone. Insisting that all of
your application developers remember to type "information_schema." all
of the time really makes them love the DBA. Mostly, this simply results
in people not using information_schema, and instead using their own
home-grown system view scripts, which are often wrong.

However, if we had push/pop/shift/unshift for search_path, the need for
search_path_suffix would be considerably diminished, since application
code (& DBAs) would use push/pop instead of replacing the entire
search_path.

> Hm, I'm beginning to think extensions need to have search_path set on
> every function or have every object reference everywhere be explicitly
> pg_extension.* (and/or _private_.* like my earlier suggestion).

Again, I'm not talking about Extensions. I think that Extensions are
completely orthagonal to search_path, hence the change of subject line.
I'm talking about making search_path (and schema) more useful to DBAs
and application designers.

>> e) having roles somehow inherit search_path on a SET ROLE***
>
> Grr. I'm still bitter about "su" doing that on some systems without
> "su -". I think I've lost that battle though and I'm forever doomed to
> never know what "su" will do on a new system.

As previously discussed, this would work via something like SET ROLE ...
WITH DEFAULTS, rather than with just SET ROLE. We don't want to break
backwards compatibility.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2009-05-31 20:13:26 Re: pg_migrator and an 8.3-compatible tsvector data type
Previous Message Andrew Chernow 2009-05-31 20:08:54 Re: check for missing tablespaces?