OSDN Database conference report (long)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-general(at)postgreSQL(dot)org, pgsql-hackers(at)postgreSQL(dot)org
Subject: OSDN Database conference report (long)
Date: 2000-11-03 04:47:10
Message-ID: 6072.973226830@sss.pgh.pa.us
Lists: pgsql-general pgsql-hackers

On Oct. 30 and 31 I attended OSDN's rather grandiosely named "Open Source
Database Summit" (despite what you might infer from the name, it was just
a small, open-to-the-public conference). Their info about the conference
is at http://www.osdn.com/conf/osd/conf_index.shtml, though I'm not sure
how long that page will remain up. OSDN invited a number of the principal
suspects from each major open-source database project to speak, and paid
for airfare and hotel rooms for the speakers. The invited speakers were
Bruce Momjian and myself from Postgres, David Axmark and Monty Widenius
from MySQL, Mike Olson and Mike Ubell from Sleepycat (Berkeley DB), and
Ann Harrison from InterBa^H^H^H^H IBPhoenix; also Britt Johnston, who as
CTO of NuSphere can fairly be ranked in the MySQL camp; plus Tim Perdue
and Rob Ferber as representative application-builders. Total attendance
was about forty or fifty, so we had a pretty good crowd of interested
people. Ned Lilly of Great Bridge was also there (on GB's dime), as well
as two or three more NuSphere people, but mostly it seemed to be users
and potential users of open-source databases.

Sunday evening, Bruce and Ned and I filtered in at different times.
OSDN had laid out a spread of free food in one of the meeting rooms,
but the hotel staff didn't tell any of us about it, so we ended up
hanging out in the hotel bar with a number of similarly ill-informed
souls. It was particularly interesting to talk to John Scott, who is
working for Verizon Wireless on redoing the software for their nationwide
paging service. It turns out they are looking at using Postgres for the
customer database, and either Postgres or Berkeley DB for the realtime
database that handles paging messages being pumped through the system.
That'll be a feather in our caps if it happens!

Monday, Britt Johnston opened the formal proceedings with what amounted
to a pep talk for OS DB work. I have a fairly interesting table in my
notes, giving current total Web-search hits for various databases:
	Oracle		3.0 mil
	MySQL		2.3 mil
	Postgres	0.7 mil
	SQL Server	0.6 mil
	IBM DB2		0.5 mil
	Interbase	0.1 mil
(I had to copy the last half of the table from memory, so it may not be
exactly what he said, but it's close.) This says that MySQL+PG together
are *already* as interesting as Oracle for web work. He also announced
that NuSphere would be financing substantial work on MySQL --- I have a
note about 10000 concurrent transactions on a single server, which'd be
pretty impressive (he didn't say what size server, though).

The conference format alternated between group-wide sessions and pairs
of concurrent workshop talks, so after that we split into two groups.
I went to hear Tim Perdue talk, while Bruce and Ned listened to Mike
Ubell; they'll have to report on what Mike said. Tim's discussion was
about building apps atop PHP and a database. He pointed out that for
most website builders, the path of least resistance given their existing
skills is to construct an "application heavy" system in which most of
the logic is in application code. He contrasted this with "database
heavy" design, in which more reliance is put on database functionality,
such as constraints, triggers, views, etc. Unfortunately (from our
point of view) Postgres excels for the database-heavy style, whereas
MySQL's lean feature set is sufficient or at least self-reinforcing
for the application-heavy style. It'll be difficult for PG to achieve
world domination until Web developers become more database-savvy ;-).
Tim encouraged a great deal of comment from the audience, and went so far
as to make everyone introduce themselves first. (One interesting thing
that emerged at that point was that there were *very* few MySQL users,
and no MySQL developers, at this talk --- though I guess that just meant
that the MySQL people all wanted to hear what Mike Ubell had to say,
since he was talking about a directly-MySQL-related subject.) One of the
longest-running parts of the discussion had to do with giving good error
messages and how it is hard to get friendly messages when you rely on the
database to do error checking. I thought this pointed up the need we've
been aware of for a while to overhaul our error reporting. Tim also had
a "wish list" for PG that included better admin tools, such as a way to
see exactly what queries are running; and a way to retrieve all the
database-generated items in a just-inserted row, not only the OID.
Both of these have also been on the radar screen for a while.
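
To make the "application heavy" versus "database heavy" distinction and
the error-message complaint concrete, here is a minimal sketch. It is
only an illustration, not anything Tim showed: it uses Python's standard
sqlite3 module as a stand-in for a real server, and the table, column,
and function names are made up for the example.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stock ("
        " item TEXT PRIMARY KEY,"
        " qty INTEGER NOT NULL CHECK (qty >= 0))"
    )
    conn.execute("INSERT INTO stock VALUES ('widget', 1)")
    conn.commit()
    return conn


def sell_app_heavy(conn, item):
    """Application-heavy: the app checks the business rule itself."""
    row = conn.execute(
        "SELECT qty FROM stock WHERE item = ?", (item,)
    ).fetchone()
    if row is None:
        return "Sorry, we do not carry %s." % item
    if row[0] < 1:
        return "Sorry, %s is out of stock." % item
    conn.execute("UPDATE stock SET qty = qty - 1 WHERE item = ?", (item,))
    conn.commit()
    return "ok"


def sell_db_heavy(conn, item):
    """Database-heavy: the CHECK constraint enforces the rule, and the
    app has to translate the database's error into something friendly."""
    try:
        cur = conn.execute(
            "UPDATE stock SET qty = qty - 1 WHERE item = ?", (item,)
        )
        if cur.rowcount == 0:
            conn.rollback()
            return "Sorry, we do not carry %s." % item
        conn.commit()
        return "ok"
    except sqlite3.IntegrityError as exc:
        conn.rollback()
        # The raw database message is accurate but not user-friendly.
        return "Sorry, %s is out of stock. (database said: %s)" % (item, exc)
```

In the database-heavy version the rule cannot be bypassed by another
client that forgets the check, but the friendly wording has to be
reconstructed from whatever error text the database hands back.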

After a fine lunch (all the food was superb BTW; OSDN made an excellent
choice of hotel), we reconvened to hear David Axmark talk about the
history and philosophy of MySQL. The only thing that really surprised me
is that that project is quite young: it started in 1995. Given that Monty
seems to do the vast majority of the development work, there are not many
man-years in it, certainly far less than in Postgres. They've done well
to come as far as they have.

The subsequent breakout was between Rob Ferber talking about shedding
database processing load to stateless clients, and me talking about
Postgres' transaction model. I was quite annoyed that I couldn't go
hear Rob, because his talk abstract sounded very interesting :-(.
You can find the slides from my talks (also Bruce's) at
http://www.postgresql.org/osdn/index.html, so I won't go into detail,
but I hope Bruce will report on Rob's talk.

That evening there was a cocktail hour in the hotel's library (free
booze, courtesy of the conference) followed by dinner at the hotel's
better restaurant. I spent a good part of the cocktail hour talking
with Ann Harrison and several other people about organizing some sort
of open-source database benchmarking project. It turns out that DEC's
(now Compaq's) performance measurement group has a nearly-done reference
implementation of AS3AP, which they're thinking of releasing as an open
source project. Everyone agreed that would be a fine starting point.
We also got to hear Ann's version of the InterBase situation --- more
about that later. Towards the end of the hour I wandered over and started
to talk to Monty and David. That stretched into eating dinner with them.
Since I'd had a couple glasses of wine already, and a couple more during
dinner, while they'd started with vodka and then joined in on the wine,
I doubt that either side could repeat much of the conversation word-for-
word ;-). But it was all pleasant and perhaps will serve to dispel some
of the bad blood that's existed between the two projects for a while.

The next morning, the opening speaker was me, with a presentation on the
internals of Postgres (see slides at above URL). The subsequent breakout
had Bruce giving a talk on the history and project-management practices
of Postgres (see slides), while I went to hear Ann Harrison talk about
integrity checking in databases. Before she could get into her promised
topic, the audience pretty much forced her to give a rundown on the
InterBase situation. Bottom line: it's a mess. She feels Borland were
being unreasonable (and in her telling of it, they indeed seem to be)
while they felt, or said they felt, that she was. She thinks that once
she and others had come up with a business plan for doing something with
InterBase rather than dropping it, Borland/Inprise decided they could
execute the business plan without her --- and that may be pretty accurate.
Anyway, Inprise now has a small in-house development team with few if any
original developers, Ann has only jawbone control over a dozen or so
open-source developers (these also with little or no deep knowledge of the
source, apparently), and there's a code fork between the Inprise version
and the "Firebird" open-source project. The two groups are apparently
talking enough to try to keep their trees from diverging too much, in the
hopes that the fork might be reunited someday, but Ann didn't sound all
that hopeful about it. Things sound mighty bleak to me --- but perhaps
InterBase is just going through a transition comparable to Postgres'
transition from a Berkeley project to an open project.

To get back to the technical part of Ann's talk, the thing I came away
with is a realization that IB did a lot of things pretty similar to
Postgres. In particular, it sounds like they have a multi-versioning
model nearly identical to Postgres'. They also have some ideas we might
be able to adopt --- for example, their indexes point only to the newest
version of a row, not all versions. It'd be worth our while to dig
through their code for ideas. However, Ann admitted that they are
woefully short on internals documentation, so extracting useful ideas
promises to be painful :-(
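
As a rough picture of what "indexes point only to the newest version"
could mean, here is a toy sketch. It is emphatically not InterBase's or
Postgres' actual design: the back-pointer chain, the class names, and the
simplification that every write commits immediately under its own
transaction id are all assumptions made for illustration.

```python
class Version:
    """One version of a row (toy model)."""

    def __init__(self, value, created_by, older=None):
        self.value = value
        self.created_by = created_by  # id of the transaction that wrote it
        self.older = older            # previous version of the row, or None


class ToyTable:
    """Toy multi-version table whose index holds only the newest version."""

    def __init__(self):
        self.index = {}     # key -> newest Version only
        self.next_txid = 1

    def write(self, key, value):
        """Add a new version; assume the writing transaction commits at once."""
        txid = self.next_txid
        self.next_txid += 1
        # The index entry is repointed; the old version hangs off the new one.
        self.index[key] = Version(value, txid, self.index.get(key))
        return txid

    def read(self, key, snapshot_txid):
        """Return the value visible to a reader that can see only
        transactions with id <= snapshot_txid, or None if there is none."""
        version = self.index.get(key)
        while version is not None and version.created_by > snapshot_txid:
            version = version.older   # walk back to an older version
        return None if version is None else version.value
```

With this layout an update never adds index entries, at the price of
older readers having to walk the chain from the newest version backward.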

The final group-wide session featured Mike Olson of Sleepycat as speaker.
Most of you know that Mike was part of the Berkeley Postgres team years
ago (if you don't, try scanning our sources for the initials mao) so I
count him still a Postgres man, even though Sleepycat is currently in bed
with MySQL. Mike had some *extremely* interesting things to say about
the prospects for open-source databases making inroads against commercial
competition. He pointed out that the notion that we have any chance of
doing so is mostly founded on the success Linux has been having competing
with Windows --- but that success is founded on (a) a cost advantage,
(b) a reliability advantage, and (c) an advantage in the applications
space: Linux runs sendmail, bind, Apache, and all the other core Internet
server apps, whereas Windows doesn't run them especially well. Mike
pointed out that Oracle could *easily* afford to give away their software
for free and make all their money on support contracts (license fees are
already only 1/3rd of their revenue, so it wouldn't be that big a
switch). That would make the cost advantage a harder sell. We could
still make a good case for open databases on total cost of ownership,
but a key ball to keep our eye on is the ease of installation and
administration of our servers. Much of the differential comes from the
fact that qualified Oracle DBAs are scarce and obscenely well-paid.
We have to be sure that Joe Average Unix Sysadmin can deal with our
servers without much trouble. As for point (b), the news is bad:
we are *not* up to Oracle standards on reliability. (Mike only said
that it's unproven that we are up to commercial standards, but from
here in the trenches I'd say we ain't.) We need to keep our noses to
the grindstone on this issue, and even so it's unlikely that we'll ever
have the same sort of obvious reliability advantage that Linux has over
Windows, simply because the commercial databases aren't anywhere near
as bad as Windows. That leaves point (c) --- we have to exploit the
open-source nature of our systems to encourage a flowering of compatible
applications. And we'd better make sure that people can make money
building apps atop open-source databases, or that flowering won't happen.
A thought-provoking talk indeed; probably the best one at the conference,
IMHO.

The final pair of speakers were Monty on the history and
project-management practices of MySQL, and Rob Ferber on Open Sales'
^H^H^H^H Zelerate's way of building distributed transaction processing.
Bruce went to hear Monty, I went to hear Rob. It was pretty interesting:
basically, they do not try to replicate state, but instead distribute
"events" --- maybe better called "actions", since the events are things
like "decrement available-stock by 1". Each server in their network is
"authoritative" for events that it originates, and is responsible for
transmitting those events to other servers. Each server maintains state
tables that represent the integral of all the events it knows of so far,
but it's explicitly recognized that these state tables may be out of sync
due to network latency, communication failures, etc. With appropriate
application programming it's possible to build a highly robust distributed
system, sitting atop non-distributed database servers. Their system is
open source and all coded in Perl, so you can go have a look if you want
to learn more.
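
The event-distribution idea can be sketched in a few lines. This is only
an illustration of the scheme as described above, written in Python
rather than their Perl; the class and method names are hypothetical, and
the duplicate-suppression by (origin, sequence number) is an assumption
added so that retransmission is harmless.

```python
class Server:
    """Toy server that distributes events rather than replicating state."""

    def __init__(self, name):
        self.name = name
        self.seq = 0        # sequence number for events originated here
        self.log = []       # events this server is authoritative for
        self.seen = set()   # (origin, seq) ids of events already applied
        self.stock = {}     # state table: item -> sum of all known deltas

    def originate(self, item, delta):
        """Create an event such as "decrement available-stock by 1"."""
        self.seq += 1
        event = (self.name, self.seq, item, delta)
        self.log.append(event)
        self.apply(event)
        return event

    def apply(self, event):
        """Fold one event into the state table; ignore duplicates."""
        origin, seq, item, delta = event
        if (origin, seq) in self.seen:
            return False
        self.seen.add((origin, seq))
        self.stock[item] = self.stock.get(item, 0) + delta
        return True

    def send_to(self, peer):
        """Transmit every event this server originated to a peer."""
        for event in self.log:
            peer.apply(event)
```

Until send_to has run in both directions, two servers can hold different
totals for the same item; that is the acknowledged out-of-sync window,
and it closes once each side has applied the other's events.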

Bruce and John Scott and I wasted most of Tuesday evening in a fruitless
search for the Computer Literacy bookstore that used to exist near Apple
headquarters, so I can't say if anything interesting happened around the
hotel then. But it seemed that things were winding down and a lot of
people were departing that evening, so probably not...

Overall it was a very interesting and worthwhile conference. I have to
congratulate Mark Stone and Christine Dzierzeski of OSDN on organizing
a great conference on little time and minimal budget. If they invite
me to the next one, I'll be there.

regards, tom lane
