Re: [HACKERS] Proposed Changes to PostgreSQL

From: Chris Bitmead <chrisb(at)nimrod(dot)itg(dot)telstra(dot)com(dot)au>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Chris Bitmead <chris(at)bitmead(dot)com>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgreSQL(dot)org>, pgsql-sql(at)postgreSQL(dot)org, pgsql-general(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Proposed Changes to PostgreSQL
Date: 2000-02-03 23:03:14
Message-ID: 389A0932.6D4C0CC@nimrod.itg.telecom.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers pgsql-sql

Peter Eisentraut wrote:

> A lot of people use inheritance to create "consistent schemas", that is,
> they empty create base tables, such as "address" which are inherited by
> tables such as customer, vendor, office, etc.

This is a really bad idea. You could never have both a postal address
AND
a home address for example. I thought the original postgres supported
this
by having
CREATE TABLE ADDRESS (...)
CREATE TABLE PERSON(add ADDRESS).

Anyway, this is what Oracle and others can do these days, and this is
the right
thing.

> Anyway, an idea I had would be to reimplement inheritance based on joins,
> since this is what the "pure relational" solution would be anyway. When I
> create a table B that is based on A, all the system does is create the
> table B as usual and store a note "I inherit from A". Any row you insert
> into B also creates a row in A, and the row in B contains an oid pointer
> to it.

This is a really stu^H^H^H bad idea. I have hierarchies 5 levels deep
with
multiple inheritance, and I
don't want to do a 10 way join just to retrieve an object.

This is why RDBMS's performance sucks so incredibly badly on some
applications.
an ODBMS can perform 100x as fast in these cases just because of what
you
are proposing.

> Thus a select on B performs a join on A.oid and B.row_in_A_pointer.
> A select on A just returns all the rows in A, no extras needed. A delete
> on B deletes the row in B and in A. A delete in A would cascade to B. Both
> of this can be gotten for free with foreign keys. Adding a column to A
> just adds the column to A, all other tables get the new column magically
> and in the right order. Same with dropping columns, etc.
>
> In short, this approach solves all inheritance problems at once and does
> so without adding any extra kludges besides the "I inherited from" field,
> which is static, plus the necessary transformations necessary in the
> parser. The drawback is of course that a select from an inherited table
> would always incur a join, perhaps some optimizing could be done in this
> direction. But the bottom line is that the compatibility issue looms big.
>
> -Peter
>
> On Thu, 3 Feb 2000, Chris Bitmead wrote:
>
> > Hi,
> >
> > I've been spending a lot of time lately with gdb and tracing the
> > back-end seeing if I can understand it enough to make some changes.
> > I'm starting to actually understand a lot of stuff, so in order
> > to have some possibility of having my changes accepted, I want to
> > discuss
> > them here first. Based on that, I'm going to hopefully make an attempt
> > at implementation. I have a patch for one of these changes already
> > if I get the go ahead.
> >
> > THESE CHANGES DON'T AFFECT YOU IF YOU DON'T USE INHERITANCE.
> >
> > Speak now about these changes or please, forever hold your peace. Of
> > course you can comment later if I screw up implementation.
> >
> > The proposed changes are....
> >
> > 1) An imaginary field in every tuple that tells you the class it came
> > from.
> > This is useful when you select from table* and want to know which
> > relation the object actually came from. It wouldn't be stored on disk,
> > and like oid it wouldn't be displayed when you do SELECT *. The field
> > would be called classname. So you could have...
> > SELECT p.classname, p.name FROM person p;
> > person | Fred
> > student | Bill
> > employee | Jim
> > person | Chris
> >
> > If you want to know the exact behaviour it is as if every table in the
> > database had done to it...
> > ALTER TABLE foo ADD COLUMN classname TEXT;
> > UPDATE foo SET classname='foo';
> >
> > Of course this is not how it would be implemented. It is just
> > reference for how it will appear to work. BTW, this idea was also
> > in the original berkeley design notes.
> >
> > 2) Changing the sense of the default for getting inherited tuples.
> > Currently you only get inherited tuples if you specify "tablename*".
> > This would be changed so that you get all sub-class tuples too by
> > default unless you specify "ONLY tablename". There are several
> > rationale for this. Firstly this is what Illustra/Informix have
> > implemented. Secondly, I believe it is more logical from an OO
> > perspective as well as giving a more useful default. If a politician
> > IS a person and I say SELECT * from person, then logically I should
> > see all the politicians because they are people too (so they claim
> > :). Thirdly, there are a whole range of SQL statements that should
> > probably be disallowed without including sub-classes. e.g. an ALTER
> > TABLE ADD COLUMN that does not include sub-classes is almost certainly
> > undesirable. It seems ashame to have to resort to non-standard SQL
> > with the "*" syntax in this case when it is really your only
> > choice. Basicly, wanting ONLY a classname is a far more unusual
> > choice, and leaving off the "*" is a common error. Fourthly, it seems
> > out of character for the SQL language to have this single character
> > operator. The SQL style is to use wordy descriptions of the operators
> > meaning. "ONLY" fits well here because it describes its own meaning
> > perfectly whereas to the unitiated, "*" is harder to guess at. While
> > this change is an incompatibility I hope for those few people using
> > inheritance they can accept the need to move forward without
> > over-burden of backwards compatibility.
> >
> > 3) The ability to return different types of rows from a SELECT. This
> > is to allow implementation of ODBMS functionality where a query could
> > be required to instantiate objects of differing types with differing
> > attributes.
> >
> > I would propose that that anytime you do a SELECT * from a base table
> > that you would get back the full rows from those sub tables. Since the
> > current PQ interface which doesn't support this notion would remain
> > unchanged this wouldn't affect current users.
> >
> > It's probably also desirable to have a syntax for getting just the
> > columns of the base table when this is desired. Say perhaps SELECT %
> > from table. This would be a performance hack for users of libpq and a
> > functionality difference for users of psql.
> >
> > The reason I think the "*" syntax should take on the new functionality
> > is because it would be more consistent with what the OQL (object query
> > language) standard specifies, and also because it seems the more
> > useful default. Also there is no compatibility reason not to do it.
> >
> > In addition it would be legal to specify columns that only exist in
> > sub-classes. For example, if we had
> >
> > CREATE TABLE person (name TEXT);
> > CREATE TABLE student (studentid TEXT, faculty TEXT) INHERITS (person);
> >
> > it would be legal to say...
> > > SELECT * FROM person;
> > NAME
> > ----
> > Fred
> > Bill
> >
> > NAME | STUDENTID | FACULTY
> > --------------------------
> > Jim | 23455 | Science
> > Chris| 45666 | Arts
> >
> > > SELECT *, studentid FROM person;
> > NAME
> > ----
> > Fred
> > Bill
> >
> > NAME | STUDENTID
> > ----------------
> > Jim | 23455
> > Chris| 45666
> >
> > > SELECT *, studentid FROM ONLY person;
> > ERROR: person does not contain studentid.
> >
> > > SELECT % FROM person;
> > NAME
> > ----
> > Fred
> > Bill
> > Jim
> > Chris
> >
> > As you can see, it is desirable that psql be modified to be able to
> > print these differing tuple types. Presumably new column headings will
> > be printed when a tuple is differing to the previous one. Likely it
> > will be often desirable to do a
> > SELECT * FROM person p ORDER BY p.classname;
> > in order to have all the tuples of a particular type grouped together.
> >
> > In addition some extenions will be done to the PQ interface to support
> > these differing return types. The current PQ interface will be left
> > unchanged and backwards compatible for retrieving rows of a single
> > type.
> >
> > Also there should be an settable option that specifies that "*" should
> > also return the normally ignored columns of oid and classname. This is
> > so that OO programs that embed SQL into them also get back the oid and
> > classname which are required for the behind the scenes implementation
> > of an ODMG client. Something like...
> >
> > SET SHOW_OID TRUE;
> > SHOW_CLASSNAME TRUE;
> >
> > SELECT * FROM person;
> >
> > OID CLASSNAME NAME
> > -------------------
> > 2344 person Fred
> > 3445 person Bill
> >
> > OID CLASSNAME NAME | STUDENTID | FACULTY
> > -----------------------------------------
> > 2355 student Jim | 23455 | Science
> > 5655 student Chris| 45666 | Arts
> >
> > ************
> >
> >
>
> --
> Peter Eisentraut Sernanders vaeg 10:115
> peter_e(at)gmx(dot)net 75262 Uppsala
> http://yi.org/peter-e/ Sweden

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Chris Bitmead 2000-02-03 23:27:37 Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL
Previous Message Hannu Krosing 2000-02-03 22:57:45 Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL

Browse pgsql-hackers by date

  From Date Subject
Next Message Chris Bitmead 2000-02-03 23:27:37 Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL
Previous Message Hannu Krosing 2000-02-03 23:00:28 Re: [HACKERS] SERIAL type isn't listed...?

Browse pgsql-sql by date

  From Date Subject
Next Message Chris Bitmead 2000-02-03 23:27:37 Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL
Previous Message Hannu Krosing 2000-02-03 22:57:45 Re: [HACKERS] Re: [SQL] Proposed Changes to PostgreSQL