On 11/26/2005 12:32 PM, bostic(at)sleepycat(dot)com wrote:
>> First BDB is not a viable replacement for InnoDB for two reasons both
>> of which stem from BDB architectural considerations (it simply wasn't
>> designed to function well as a backend for a high concurrency RDBMS).
>> Basically, while InnoDB uses MVCC, BDB uses page locks. BDB therefore
>> has locking issues because you don't have the snapshot capabilities that
>> MVCC gets you with InnoDB, and it is unlikely that one will ever be able
>> to provide multiple transaction levels with the BDB storage engine.
> I don't agree.
> While Berkeley DB was not designed as an RDBMS backend, it's not
> that far from where BDB is now to being a RDBMS backend: the
> significant missing pieces might be MVCC, foreign key support
> and moving from page-level to row-level locking.
> Berkeley DB has had multiple transaction levels for a long time.
> I don't believe MVCC is that hard. I think foreign key support
> is more cleanly done above the backend engine -- MySQL used
> InnoDB's support for foreign keys because it was there, not
> because it's the right place to do it.
How hard MVCC is depends on where you start from. In the Postgres case,
we already had a non-overwriting storage manager that kept old row
versions around. All that needed to be done was to figure out which of
the versions is actually the visible one and teach vacuum to keep those
that could still be seen by someone.
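
To make that concrete, here is a minimal sketch of the idea in Python. It
is not the Postgres code; the names (RowVersion, Snapshot, is_visible,
vacuum) are made up for illustration, and it assumes every finished
transaction committed (no aborts, no command ids).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowVersion:
    value: str
    xmin: int                    # transaction id that created this version
    xmax: Optional[int] = None   # transaction id that superseded or deleted it


@dataclass(frozen=True)
class Snapshot:
    xid_limit: int               # ids >= this started after the snapshot was taken
    in_progress: frozenset = frozenset()  # ids still running at snapshot time

    def sees(self, xid: int) -> bool:
        # Assumption: any transaction that is not in progress has committed.
        return xid < self.xid_limit and xid not in self.in_progress


def is_visible(version: RowVersion, snap: Snapshot) -> bool:
    # Visible if the creator is seen and the superseding transaction is not.
    if not snap.sees(version.xmin):
        return False
    return version.xmax is None or not snap.sees(version.xmax)


def visible_version(chain: List[RowVersion], snap: Snapshot) -> Optional[RowVersion]:
    # Old versions are never overwritten, so just pick the one this snapshot sees.
    for version in chain:
        if is_visible(version, snap):
            return version
    return None


def vacuum(chain: List[RowVersion], open_snapshots: List[Snapshot]) -> List[RowVersion]:
    # Keep the live version and anything an open snapshot can still see.
    # Assumption: every xmax in the chain belongs to a committed transaction.
    return [
        v for v in chain
        if v.xmax is None or any(is_visible(v, s) for s in open_snapshots)
    ]
```

With a chain of two versions, where transaction 3 replaced the row written
by transaction 1, a snapshot taken before 3 committed still reads the old
version, and vacuum may only drop it once no such snapshot is open.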
I think that BDB is an overwriting storage engine (I could be wrong), in
which case MVCC is quite a bit more hairy than what we needed. And you
definitely need MVCC for any transaction isolation above read committed,
because otherwise you will have shared read locks preventing updates and
your performance in a concurrent environment just goes down the drain.
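
The locking problem can be sketched in a few lines of Python. This is a
toy lock table, not BDB's lock manager; PageLock and its methods are
hypothetical names used only to show how a reader that must hold its
shared lock until commit keeps a writer out.

```python
class PageLock:
    """Toy shared/exclusive lock on one page (illustrative only)."""

    def __init__(self):
        self.readers = set()   # transactions holding a shared lock
        self.writer = None     # transaction holding the exclusive lock

    def try_shared(self, txn) -> bool:
        # A shared lock is refused while another transaction writes.
        if self.writer is not None and self.writer != txn:
            return False
        self.readers.add(txn)
        return True

    def try_exclusive(self, txn) -> bool:
        # An exclusive lock is refused while anyone else reads or writes.
        if self.writer is not None and self.writer != txn:
            return False
        if self.readers - {txn}:
            return False
        self.writer = txn
        return True

    def release(self, txn) -> None:
        # Called at commit or rollback.
        self.readers.discard(txn)
        if self.writer == txn:
            self.writer = None
```

A transaction that wants repeatable reads without MVCC has to keep its
shared lock until it ends, so try_exclusive() from an updater fails for
that whole time. With snapshots the reader takes no such lock at all.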
I wholeheartedly agree that using a storage engine based foreign key
solution like what InnoDB offered was wrong to begin with and should be
reimplemented in the upper levels anyway. So losing that feature isn't
actually what I consider bad. What they have now has neither DEFERRABLE
nor ON DELETE SET DEFAULT. Not sure if they intended to fix that in the
next version, especially while other feature areas still need a lot
of attention. I was shocked to learn that functions and triggers cannot
access any tables. So all a trigger can do is check/modify the values at
hand. No table lookups, no audit functionality, nada. When I created
PL/Tcl and PL/pgSQL it didn't even cross my mind as a possibility to
release any procedural language without access to the DB ... and that
was several years ago!
All the above together, plus rolling out a completely new release within
the next 6-9 months sounds challenging, to say the least. And to be
honest, cranking out a completely new storage engine from scratch in
that timeframe is unrealistic. The really bad part here is that the
decision what to do must be made right now, because otherwise, the time
for their renewal talks with Oracle is up and they don't even have an
alternative in sight. So they have to make a decision that will cost a
lot of money and will pull away developers from other, important work.
Plus, while MySQL AB spends that time catching up with today's InnoDB,
Oracle has the same time available - maybe for improving it significantly?
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #