So here is a very rough draft. I would be interested in feedback as to
inaccuracies or omissions. I would like to get the technical side right
before going into an editorial phase.
Any feedback on the technical side?
How is PostgreSQL "Object-Relational?"
The term Object-Relational has been applied to databases which attempt to
bridge the relational and object-oriented worlds with varying degrees of
success. Bridging this gap is typically seen as desirable because
object-oriented and relational models are very different paradigms and
programmers often do not want to switch between them. There are, however,
fundamental differences that make this a very hard thing to do well. The
best way to think of PostgreSQL in this way is as a relational database
management system with some object-oriented features.
PostgreSQL is a development platform in a box. It supports stored
procedures written in entirely procedural languages like PL/PGSQL or Perl
without loaded modules, and more object-oriented languages like Python or
Java, often through third party modules. To be sure you can't write a
graphical interface inside PostgreSQL, and it would not be a good idea to
write additional network servers, such as web servers, directly inside the
database. However the environment allows you to create sophisticated
interfaces for managing and transforming your data. Because it is a
platform in a box the various components need to be understood as different
and yet interoperable. In fact the primary concerns of object-oriented
programming are all supported by PostgreSQL, but this is done in a way that
is almost, but not quite, entirely unlike traditional object oriented
programming. For this reason the "object-relational" label tends to be a
frequent source of confusion.
Data storage in PostgreSQL is entirely relational, although this can be
degraded using types which are not atomic, such as arrays, XML, JSON, and
hstore. Before delving into object-oriented approaches, it is important to
master the relational model of databases. For the novice, this section is
therefore entirely informational. For the advanced developer, however, it
is hoped that it will prove inspirational.
In object-oriented terms, very relation is a class, but not every class is
a relation. Operations are performed on sets of objects (an object being a
row), and new row structures can be created ad-hoc. PostgreSQL is,
however, a strictly typed environment and so in many cases, polymorphism
requires some work.
Data Abstraction and Encapsulation in PostgreSQL
The relational model itself provides some tools for data abstraction and
encapsulation, and these features are taken to quite some length in
PostgreSQL. Taken together these are very powerful tools and allow for
things like calculated fields to be simulated in relations and even indexed
for high performance.
Views are the primary tool here. With views, you can create an API for
your data which is abstracted from the physical storage. Using the rules
system, you can redirect inserts, updates, and deletes from the view into
underlying relations, preferably using user defined functions. Being
relations, views are also classes.
A second important tool here is the ability to define what appear to be
calculated fields using stored procedures. If I create a table called
"employee" with three fields (first_name, middle_name, last_name) among
others, and create a function called "name" which accepts a single employee
argument and concatenates these together as "last_name, first_name
middle_name" then if I submit a query which says:
select e.name from employee e;
it will transform this into:
select name(e) from employee e;
This gives you a way to do calculated fields in PostgreSQL without
resorting to views. Note that these can be done on views as well because
views are relations. These are not real fields though. Without the
relation reference, it will not do the transformation (so SELECT name from
employee will not have the same effect).
Messaging and Class API's in PostgreSQL
A relation is a class. The class is accessed using SQL which defines a new
data structure in its output. This data structure unless defined elsewhere
in a relation or a complex type cannot have methods attached to it and
therefore can not be used with the class.method syntax described above.
There are exceptions to this rule, of course, but they are beyond the
scope of this introduction. In general it is safest to assume that the
output of one query, particularly one with named output fields, cannot
safely be used as the input to another.
A second messaging aparatus in PostgreSQL is the LISTEN/NOTIFY framework
which can be used along with triggers to issue notifications to other
processes when a transaction commits. This approach allows you to create
queue tables, use triggers to move data into these tables (creating
'objects' in the process) and then issuing a notification to another
process when the data commits and becomes visible. This allows for very
complex and and interactive environments to be built from modular pieces.
Polymorphism in PostgreSQL
PostgreSQL is very extensible in terms of all sorts of aspects of the
database. Not only can types be created and defined, but also operators
can be defined or overloaded.
A more important polymorphism feature is the ability to cast one data type
as another. Casts can be implicit or explicit. Implicit casts, which have
largely been removed from many areas of PostgreSQL, allow for PostgreSQL to
cast data types when necessary to find functions or operators that are
applicable. Implicit casting can be dangerous because it can lead to
unexpected behavior because minor errors can lead to unexpected results.
'2012-05-31' is not 2012-05-31. The latter is an integer expression that
reduces to 1976. If you create an implicit cast that turns an integer into
a date being the first of the year, the lack of quoting will insert
incorrect dates into your database without raising an error ('1976-01-01'
instead of the intended '2012-05-31'). Implicit casts can still have some
Inheritance in PostgreSQL
In PostgreSQL tables can inherit from other tables. Their methods are
inherited but their castes are not, nor are their indexes. This allows you
develop object inheritance hierarchies in PostgreSQL. Multiple inheritance
Table inheritance is an advanced concept and has many gotchas. Please
refer to the proper sections of the manual for more on this topic. On the
whole it is probably best to work with table inheritance first in areas
where it is more typically used, such as table partitioning, and later look
at it in terms of object-relational capabilities.
Overall the best way to look at PostgreSQL as an object-relational database
is a database which provides very good relational capabilities plus some
advanced features that allows one do create object-relational systems on
top of it. These systems can then move freely between object-oriented and
relational worldviews but are still more relational than object-oriented.
At any rate they bear little resemblance to object-oriented programming
environments today. With PostgreSQL this is very much a toolkit approach
for object-relational databases building on a solid relational foundation.
This means that these are advanced functions which are powerful in the
hands of experienced architects, but may be skipped over at first.
In response to
pgsql-docs by date
|Next:||From: Bruce Momjian||Date: 2012-08-15 15:47:41|
|Subject: Re: Foreign server version and type|
|Previous:||From: Tom Lane||Date: 2012-08-15 04:21:50|
|Subject: Re: Would like to contribute a section to docs for 9.3. Where to start?|