Re: New feature request: FlashBack Query

From: "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
To: "Gregory Stark" <stark(at)enterprisedb(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "Chad Wagner" <chad(dot)wagner(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, RPK <rohitprakash123(at)indiatimes(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: New feature request: FlashBack Query
Date: 2007-02-20 15:20:04
Message-ID: 36e682920702200720v6aab76d2w770ab47d061c4587@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2/20/07, Gregory Stark <stark(at)enterprisedb(dot)com> wrote:
> I used to say that too but I've since realized it's not really true.

Heh, take a joke man... I was following up on Drake's email :)

But, since you want to discuss your view of the systems openly... I'll
gladly reply :)

> It's more like Oracle is optimized for data that's committed
> long in the past and we're optimized for data that's
> been recently updated.

Wrong. When Oracle says it's committed, it's committed. No
difference between when, where, and how. In Oracle, the committed
version is *always* the first presented to the user... it takes time
to go back and look at older versions; but why shouldn't that be a bit
slower, it isn't common practice anyway. Same with rollbacks... why
should they optimize for them when 97% of transactions commit?

> In Oracle the data that's been committed long in the past requires no
> transactional overhead but the data that's been recently updated requires lots
> of work to fetch the right version.

Wrong. The same transactional overhead applies to *all* data in
Oracle no matter of when it was committed. Similarly, the only
overhead required occurs when someone is querying in serializable
isolation or on read-committed data before or during a commit. On
short OLTP-type transactions, Oracle has the most optimized solution.

> In Postgres it's the other way around. data that's been committed deleted long
> ago requires extra work to clean up but data that's been recently changed
> requires little additional work to see the correct version.

PostgreSQL has little additional work? Like, checking the validity of
every tuple? Oracle checks visibility at the block level, so there's
*much* less overhead. Take most of the benchmarks which can hold ~200
tuples per block. Tables in those benchmarks are 100+ million rows.
On a sequential scan, Oracle would perform 500K checks, PostgreSQL
would perform *all* 100M checks (not counting dead versions due to row
updates and the like). On an index scan, Oracle not only has a
smaller index and less to check, but also knows the tuple will be
committed and will, in most cases, not have to perform additional
physical I/O to find the latest version of a row.

Of course, Oracle's design is much more complicated in its ability to
build read-committed versions of the blocks at runtime; something the
simplicity of PostgreSQL's MVCC design eliminates.

> In a sense then it's the opposite of what we usually say. Oracle is optimized
> for mostly static data. Postgres is optimized for changing data.

Care to share an example to prove it?

Like always, there are pros and cons with both designs, but denying
facts gets us nowhere. We're off-topic now... so we should either
move this off line or to another thread. I personally don't see much
of a reason to continue discussing MVCC designs anymore as Oracle's is
patented and PostgreSQL's is highly unlikely to change drastically.

As always, I'd suggest discussing improvements, not the status quo.
Likewise, discussing Oracle's design, drawbacks, and limitations
without having used it extensively is quite obvious to anyone familiar
with Oracle. Don't get me wrong, it's fine to prefer one design to
another, but pushing discussion items comparing Oracle to PostgreSQL
because of things you've heard or read somewhere isn't the same as
understanding them because you've used them.

--
Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
33 Wood Ave S, 3rd Floor | jharris(at)enterprisedb(dot)com
Iselin, New Jersey 08830 | http://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Rod Taylor 2007-02-20 15:42:28 Re: New feature request: FlashBack Query
Previous Message Simon Riggs 2007-02-20 15:08:04 Re: [HACKERS] HOT WIP Patch - version 2