A post-7.1 wish-list.

From: Emmanuel Charpentier <charpent(at)bacbuc(dot)dydndns(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: A post-7.1 wish-list.
Date: 2001-01-07 10:31:09
Message-ID: 3A58456D.2C9E9201@bacbuc.dydndns.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear list,

According to this list's content, the upcoming 7.1 release appears to be
in good progress. This version will bring a *lot* on necessary features
for some database work : unions and subselects in views, and the
long-awaited outer joins. In other words, while 7.0 was a large step in
*performance* terms, 7.1 will be a huge advance in *competence*.

These advances will allow me to use PostgreSQL for some work I had to do
until now with (gasp !) MS-Access (which has poor performance but good
competence). And get rid of the damn MS-Windows envoronment for good !

This leads me to express two whishes for future PotgreSQL developments.
These ideas are inpired by my daily work and might or might not be of
great usefulness for other uses.

My daily work (biostatistics) involves managing a lot of small but
complex databases : those are mainly medical records, created for a
study's purposes, wich have a low volume (a dozen or two of tables
having some dozens to some thousands rows) but might have a deeply
nested and irregular structure (not all patients recorded need to have
records of all histories and procedures involved). As a consequence, I
am much more interested in competence than in performance, and so is my
wishlist. Keep that in mind when reading what follows.

1) Updatable views.
==================

According to the current documentation, views are read.only. This
implies some grunt work when creating update forms for the kind of
low-use applications I have to manage.

I know that computing the "updatability" of a view is not a trivial
problem. Furthermore, even when a view is indeed updatable, the
update/append algorithm is not easy to compute. These problems are even
harder in multi-user mode. And I do not have any idea of the feasibility
of such updates in an OO database, where inheritance concerns will
interfere.

However, such updatable views would greatly simplify the end-user work
for creating and maintaining these records (at least when no inheritance
is involved.

I am not able to state the usefulness of such "updatable views" in more
mainstream applications. I note, however, that most "brand-name" RDBMSes
ofer that.

Your thoughs ?

2) External database or table access.
====================================

Quite often, two or more distinct applications have to use common data.
My favourite example is again medical : two othewise unrelated
applications might have to use a common medical thesaurus.

The obvious solution (including the medical thesaurus tables in each and
every application) leads to awful consistency problems. Working this way
can be properly done only with replication, which is not yet available
in PostgreSQL. Furthermore, most applications will use only one or two
views of the thesaurus, while the thesaurus might be both large and
complex.

Another "obvious solution" (delegating the use of the thesaurus to the
client application) is also a non-solution : how do you join your data
and the thesaurus data ?

The ability to "attach" (MS-Access parlance) a table or a view from
another database is quite helpful. And I think that it has a lot of
applications outside my (quite limited) realm.

For example, two different departments of the same company might have
needs for two different management applications, while having to
use/update the same company-wide accounting records. I don't se the
"replication" solution as a good one (data duplication should be
considered harmful in any circumstances).

This could be implemented in different ways. From the easiest to the
hardest :

- Attachment of Postgres databases running on the same server :
relatively easy. The data structures (internal representation) are
known, there is a guarantee of consistency in user identification,
security information is also consistent.

- Attachment of Postgres databases running on another server. There,
while the data structures are known, the user and security informations
can be inconsistent and have to be managed "by hand".

- Attachment of other databases. Ouch : this one is hard. One have to
rely on the information made available by the other database server. And
there lies a problem : there is no universal standard for this.

... or there is ? Two bridges come to mind. Using ODBC or JDBC, provided
the "other" RDBMS has that, allows to use some standard information : at
the very minimum, table names, attribute names and type, and
updatability. In most cases, you will also be able to know whether
indices are available for such and such columns.

This minimal set of information allows you to use these external tables
in your own joins. And, provided that you have update rights, the
ability to use them as "native" tables.

Of course, the use of these bridges involve some (maybe quite serious)
performance loss. But then again, I'm less interested in performance
than in competence ...

What do you think ?

Emmanuel Charpentier
--
Emmanuel Charpentier

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message The Hermit Hacker 2001-01-07 11:23:16 Re: Suggested fix for pg_dump
Previous Message Denis Perchine 2001-01-07 09:39:04 Quite strange crash