Re: Relational Inheritance Features.

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Trent Shipley <tshipley(at)deru(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Relational Inheritance Features.
Date: 2006-01-09 23:40:14
Message-ID: 20060109234014.GB3902@pervasive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, Jan 06, 2006 at 10:31:49AM -0700, Trent Shipley wrote:
> Actually, lack of DEFINE is a major gap in SQL itself. Most modern
> programming languages make a distinction between *defining* a ???multi-variable
> precursor??? (sort of a light weight type) and *instantiating* a variable. In
> SQL, definition is always conflated with instantiation using a CREATE
> expression. It should be possible to DEFINE database objects and then CREATE
> the objects; create them after an indefinite interval, and perhaps
> repeatedly. One should not have to repeatedly run verbose scripts.

Actually, you can do this in any database that allows for the definition
of compound/complex types.

I'm not sure I buy the 'verbose script' argument either... generally you
run those scripts to create all the definitions, and you re-run them
when those definitions change.

> Even more than completing than use in some still theoretical relational
> inheritance model, DEFINE would be of immediate practical use:
>
> CREATE temporary tables, that often occur as clones and that are repeatedly
> created and dropped.

This would be handy, and is how Oracle actually handles temp tables; a
DBA creates the temp table definition, and that table is just magically
there when a new session connects. From a DDL standpoint, it's very
similar to a regular table.

> CREATE views that are often dropped then recreated.
> Creating indexes that are repeatedly dropped.
> Managing frequently dropped constraints.

Not sure how much demand there is for these things...

> If one could define a free-floating or ???immaculate??? index, it could be
> repeatedly instantiated and attached to columns in different tables.

Interesting, even if only from an initial DDL standpoint.

> If one could define an immaculate constraint, it could be repeatedly created
> and attached to tables in a schema.

You can already do part of this with domains. If you wanted to get
trickier, I believe you could do it with complex types as well.

> 1.1 Some operations on relational classes.
>
> With fully developed relational inheritance it would be highly desirable to
> re-order columns in descendant tables. The first reason is a strong
> aesthetic argument. One might, for example, always want some audit columns
> to be the last five values in any table in a schema. A practical argument is
> that one could create distinct class signatures by arbitrarily recombining
> the order of columns. Naturally, it would be convenient, if one could also
> refer to columns by name in procedural languages.

I agree, but I think it's more important for us to be able to divorce
the logical layout of a table from the on-disk layout.

> Obviously, one expects that descendant tables inherit the properties of their
> parent class. These properties include the columns of the ancestor table,
> the column order of the ancestor table, ancestor constraints, indexes, and
> event actions. One currently would override a constraint with ALTER
> TABLE ... DROP CONSTRAINT. Then, if desired, another constraint could be
> added. Though effective, using ALTER TABLE to override table properties can
> be a bit difficult for a human code reader to scan. Readability might be
> improved if table columns could be declared as explicitly PLURAL or
> explicitly UNCONSTRAINED.

Example?

> The most important point made in this essay is that some operations would be
> *very* useful indeed for database design in general even if there were no
> provision for relational inheritance. First, relational databases should
> separate definition from instantiation. In particular Postgresql should
> adopt a ???DEFINE object definition_name / CREATE object_name FROM
> definition_name??? feature in advance of its competitors. The ability to
> define objects will come in very handy for temporary tables, creating

Agree that this would be very usefull for temporary objects.

> variations on a table theme, and so on. It will also eliminate the need to
> repeatedly run verbose SQL scripts.

I don't see how this will reduce the need for 'verbose SQL scripts'.
Make them less verbose, maybe, but you still have to create all the
objects in a database somehow.

> The second insight of general interest is that it should be possible to
> recycle constraint and index declarations. Furthermore, it should be
> possible to create immaculate definitions of constraints and indexes that are
> not attached to any particular table or view. Immaculate properties would
> eliminate the need to repeatedly script nearly identical constraints and
> indexes across large numbers of relations in a schema.

This certainly does seem like an interesting idea.

> Even more important is the question of what is to be gained by relational
> inheritance. I have run across design problems where relational inheritance
> with sub-classing primary keys would have been *very* convenient. One
> imagines, however, that from the perspective of a database designer trained
> in relational design, the hierarchical network dimension of relational
> inheritance will be ???abused???. (Of course, ???abuse??? would be a matter of
> perspective.)

I think some examples of how your ideas would help *common* design
problems would be very valuable.

> Not having much personal experience, my arm-chair expectation for current
> practice is that OO application and relational database schemata are
> developed in separately. Moving data from the application to the database
> thus requires a translation layer. (The translation layer usually being _ad
> hoc_ and distributed throughout the application rather than being a discrete
> module.) Report writers, however, can directly query the ???well designed???
> relational schema.

I don't necessarily agree that you want to tightly couple database
schema with programatic data structures. Often programs need
representations of data that are optimized for the task that program
performs, but would be sub-optimal when it comes to the database.

Translation layers spread throughout code is bad coding, plain and
simple. Coders who do that are probably no more likely to use advanced
databases features than to write clean code to begin with.

> With the object oriented potential of relational inheritance, I would expect
> to see a development pattern where an object oriented program is translated
> into an object oriented schema. One could then build a translation layer of
> views allowing indirect query of the underlying data, but what one really
> expects is that maintenance database developers and report writers will have
> to directly query an object oriented network database using SQL on an _ad
> hoc_ basis.
>
> My own strong suspicion is that with relational inheritance many object
> oriented programmers will design their own schemata. Or, in what will amount
> to the same thing, schemata will be developed based directly on object
> oriented analyses. Another likely scenario is that when a project gets
> behind, relational schemata get thrown out to speed application development.
>
> The net effect will be to simply move complexity downstream rather than
> dealing with it early on. Object oriented programmers will have a trivial
> problem putting data into databases slavishly parallel to object oriented
> data structures. At the other end of the development food chain, report
> writers will have a devil of a time writing SQL to extract data from what
> will have effectively become network databases.

As you mention, simply mirroring OO data structures into a database is
seldom a good idea.

What I've done in the past is worked with a higher-level language that
was responsible for building both C accessor functions/classes (though
it could certainly have easily built other languages as well) as well as
database objects. This did a great job of cleaning up the application
code and making sure that things stayed in sync. It also was flexable
enough that objects in the database could be defined without any direct
access methods, so that all access would instead go through a more
controlled means (such as a stored procedure).

Had we had the time to do so, the next step would have been to build a
higher-level language on top of that, that allowed for actual
object-level design.

In any case, if you hope to get all of your ideas looked at by the
community, I think you'll need to come up with some examples that show
the power of what you're proposing. Some of your ideas are easy to
understand (such as being able to define different objects and then
re-use that definition), but I think a lot of what you're proposing is
too difficult to understand without an example. (At least it's over my
head! :)

Also, keep in mind that what you're proposing appaers to be a heck of a
lot of work. The odds of it ever being tackled in one piece are
basically 0. Breaking this into smaller pieces will have to be done.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Bruce Momjian 2006-01-09 23:46:03 Re: [ANNOUNCE] CRITICAL RELEASE: Minor Releases to Fix DoS
Previous Message Bruce Momjian 2006-01-09 23:38:40 Re: autocommit to off