SCHEMA support (was Re: DROP TABLE inside a transaction block)

From: "Ross J(dot) Reedstrom" <reedstrm(at)wallace(dot)ece(dot)rice(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Mike Mascari <mascarm(at)mascari(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: SCHEMA support (was Re: DROP TABLE inside a transaction block)
Date: 2000-03-07 19:25:23
Message-ID: 20000307132523.C21828@rice.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 07, 2000 at 02:53:49AM -0500, Tom Lane wrote:
> Mike Mascari <mascarm(at)mascari(dot)com> writes:
> > So, as some sort of compromise, a NOTICE was issued.
>
>
> BTW, we are not *that* far from being able to roll back a DROP TABLE.
> The only thing that's really needed is for everyone to take a deep
> breath and let go of the notion that table files ought to be named
> after the tables. If we named table files after the OIDs of their
> tables, then rollback-able DROP or RENAME TABLE would be pretty
> straightforward. If you don't recall why this is, consult the
> pghackers archives...

Another data point regarding table filenames: I've been looking into
what's needed to support SQL92 schemas.

The standard defines a hierarchy of what are essentially scopes for
database objects. They are (in order from high to low):

Cluster of catalogs -> catalog -> schema -> (DB objects)

"Cluster of catalogs" is defined as (section 4.13):

Exactly one cluster is associated with an SQL-session and it defines
the totality of the SQL-data that is available to that SQL-session.

A catalog is (section 4.12):

Catalogs are named collections of schemas in an SQL-environment. An
SQL-environment contains zero or more catalogs. A catalog con-
tains one or more schemas, but always contains a schema named
INFORMATION_SCHEMA that contains the views and domains of the
Information Schema.

catalog and schema names show up in the syntax, e.g. the BNF for table
names (section 5.4):

<table name> ::=
<qualified name>
<qualified name> ::=
[ <schema name> <period> ] <qualified identifier>

<qualified identifier> ::= <identifier> <schema name> ::=
[ <catalog name> <period> ] <unqualified schema name>

Which collapses to (using unique names for the various identifiers):

<table name> ::= [ <catalog indentifier> <period> ] [ <schema indentifier>
<period> ]
<table identifier>

and make a fully qualified column name BNF:

[ <catalog identifier> <period> ] [ <schema identifier> <period> ]
[ <table identifier> <period> ] <column identifier>

so:
foo.bar.baz.bongo

is a well formed column identifier for column bongo of table baz in
schema bar in catalog foo.

What's all this mean for pgsql? Well, SCHEMA are an Entry SQL
requirement. So, the syntax: schema.table needs to be supported. Both
schema and catalog define persistent visibilty scopes, and we need to
support identical table names in multiple schema.

I see two possiblities:

1) Map a pgsql database to a SQL schema.

Since we need to support identical table names in multiple schema,
it might be tempting to map a pgsql database to a schema. In fact,
since Entry SQL requires the syntax:

CREATE SCHEMA <schema authorization identifier>

And, in practice, the SCHEMA name seems to be equal to the database user
name, the pgsql default of creating (and accessing) a DB matching the
username implies this mapping.

However, that means we need to solve the one backend accessing multiple
DBs problem. I have a feeling that there may be 'gotchas' in the current
backend code that presume that all the tuples are coming from one DB.

2) Map pgsql DB -> SQL catalog

If we do this, the multiDB access problem can be pushed down the road,
since cross catalog access (<catalog name> in identifiers) is not
even required by Intermediate SQL, only Full SQL. In addition, almost
everything about catalogs is 'implemetation defined' so we get to claim
them as done. ;-)

2a) However, if a single pgsql database is a catalog, then each DB needs
to be able to contain tables in multiple schema, potentially with the
identical table names. One solution would be to do what we do for DBs:
create seperate subdirs for each schema, and put the table files in there.
Changes are probably isolated to the storage manager code, but I haven't
looked in detail.

2b) Another possiblity is what Tom has suggested, to solve the DDL
statements in a transaction problem: use some other unique identifier
for table filenames, perhaps based on OID. Then, supporting schemas
means supporting the syntax in the parser, and that's it, I think. This
would seem to minimize the changes needed to implement this Entry SQL92
requirement.

So, what do y'all think?

Ross
--
Ross J. Reedstrom, Ph.D., <reedstrm(at)rice(dot)edu>
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St., Houston, TX 77005

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2000-03-07 19:29:32 Re: [HACKERS] alter_table.sql
Previous Message Michael Meskes 2000-03-07 19:25:10 Re: [HACKERS] library policy question