Re: Comparison of PGSQL and DB2

From: Josh Berkus <josh(at)postgresql(dot)org>
To: Shashank Joshi <shashyajoshi(at)yahoo(dot)com>
Cc: pgsql-advocacy(at)postgresql(dot)org
Subject: Re: Comparison of PGSQL and DB2
Date: 2004-03-09 18:22:24
Message-ID: 200403091022.24024.josh@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-advocacy

Shashank:

What follows is my reply to IBM's 2-year-old feature comparison of PostgreSQL
and DB2. Each of the entries is an item that DB2 has and claims that
PostgreSQL does not. I've put my replies to these claims.

> On Demand Log Archiving
> (Allows a user to close and archive the active log of a recoverable atabase
whenever necessary, and use those log files to update a standby database.)
> Incremental Backup
> (Allows a user to perform interim backups that capture database changes
since the last successful full (or incremental) backup. Minimizes backup
window for large databases.)
> Incremental Online Backups
> (Ability to perform 'delta' backups while database is in use.)

IBM is correct. We do not currently have the above features. However, they
are all expected in or very soon after the release of 7.5.

> Dual Logging
> (Mirrors the active log files, which help to protect databases from
accidental deletion of an active log and data corruption caused by hardware
failure.)

This "feature" seems redundant with the previous 3 features; I'm not sure what
IBM is getting at.

> High Availability Support from Vendors.
> (Third party.)

We have this, in quite a variety of forms. PGSQL Inc., Command Prompt, SRA
Inc., Linux Labs, Veritas, and a couple of companies in Europe whose names
escape me at the moment, all have some kind of HA PostgreSQL support.

> Dynamic Storage Allocation
> (Ability to add storage capacity to the database while the database is in
use.)

This is a "misfeature". PostgreSQL does not have "Dynamic Storage
Allocation" becuase we don't have the filesytem access issues that DB2 does
-- PostgreSQL can expand to fill the host filesystem, no allocation required.
Or, to look at it another way, we've had this since Postgres 1.0. This is
definitely an area in which we are superior to DB2.

> Automatic Integrity Checking
> (Automatic checking of the structural integrity of data pages when the data
is accessed by an application and during
> backup.)

Another "feature" we don't have becuase it's not needed. We're more
interested in preventing corruption in the first place than finding out if
it's happened.

> Encryption Functions
> (Encryption and decryption functions with a key strength of 128 bits.)

We've had these for at least 3 versions.

> Group Authentication
> (Authentication by user groups & roles, rather than by username for more
manageability.)

Once again, groups are something we've had for several versions. We don't
support "roles" becuase they are redundant to groups.

> Raw Device Support
> (Ability to leverage raw-disk access, bypassing the overhead of the
underlying operating system.
> (Note: May be supported soon))

This, in our opinion, is another "misfeature". Raw Filesystem access is
either a crutch for DBMSes which don't know how to use their host OS
properly, or a legacy of older server OSes which did not support good
filesystems. It is not useful for modern databases running on modern OSes.

> Automatic Summary Tables
> (Transparently provide the pre-aggregation/summarization of data to
dramatically improve the performance of transactions or queries that require
such data.)

Under development, possibily available with 7.5. These are otherwise know as
"materialized views".

> User Defined Task Prioritization
> (Ability to define rules on how the database should prioritize workloads
either by query cost or user profiles. Ability to monitor resource usage and
adjust the priority of queries or cancel runaway queries that exceed
predefined limits.)

This would be nice. We're not sure how to implement it; don't expect it soon
for PostgreSQL.

> Isolation Levels
> (Support all four ANSI isolation levels (UR, CS, RS, RR).)

I'm not sure about this one; I suspect that we do, however, since MVCC,
invented for the Postgres Project, has become a standard for transaction
isolation in the database industry.

> Clustered Indexes
> (Allows data records to be clustered on pages based on the sequence of a
particular index. Clustering increases the efficiency of data retrieval when
it involves accessing values by range.)

We've had these for at least 3 versions.

> Advanced Memory Management
> (Ability to pre-allocate memory for data required by applications before it
is needed (pre-fetching). Ability for considerable memory availability in
choosing access plan to resolve queries.)

Another "feature" caused by DB2's ignorance of the host OS. We *are* working
on making Postgres' use of RAM and I/O more efficient. However, I suspect
that it is already better than DB2's use.

> Dynamic Bit Map Indexing
> (Warehouse queries typically involve many AND predicates. Dynamic bit-map
technology efficiently combines multiple indexes for improved performance.)

Aha! Yes, this would be nice. We don't have it, under development or
otherwise. We'd like it.

> Bi-Directional Indexes
> (Ability to perform both forward and reverse scans on a single index.
Improves the performance of reverse scans by eliminating the need to use
temporary tables.)

<laugh> This is a *feature*? I call it a bug fix. We've had this since
7.1. -- BEFORE DB2.

For that matter, we support MANY more *types* of indexes thann DB2 does, from
my brief check.

> Data Partitioning
> (Ability to divide database into partitions, each of which can be stored and
managed on any combination of multiple logical partitions (for large SMP
machines) or physical partitions on separate machines in a "shared-nothing"
massively parallel
> hardware platform.)

This is under development and will (probably) be released with 7.5, for
seperate tables. Currently, an ad-hoc solution is possible for any talented
DBA/sysadmin, but is somewhat hard to administer. Partitioning individual
tables has been discussed but there is not yet any development code.

> SMP Support for complex queries
> (Ability to divide up work of a single complex/large query among processors
in a multi-processor environment to resolve subqueries in parallel.)

Nope, don't have this; another thing that would be nice to have.

> Parallel Backup / Restore
> (Ability to perform backup/restore of large databases faster by leveraging
all the processors in a multi-processor machine.)

This is directly related to the above feature.

> Mass Data Load
> (Ability to perform data load into the database faster by leveraging all
processors in a multi-processor machine.)

Same.

> Federated Database Support
> (Ability to allow applications to access & perform JOIN operations on
multiple disparate databases.)

This is a feature which we do not have because it violates the ANSI SQL
Specification.

> Built-in OLE DB Support
> (Allows for access to data from another database that supports the Microsoft
OLE DB standard.)

Don't have it, nor are likely to. Our interface libraries are strictly
seperate from the core program, and should and will remain so. Currently we
support ODBC and .Net access for MS tools. OLE-DB is under development.

> DRDA Support
> (Support for DRDA (industry standard communications protocol for relational
databases).)

Not sure. This is the first time I've heard of this standard. We may support
it through a 3rd-party project or library.

> MQSeries Integration
> (Features to reduce effort required to integrate database with MQSeries
Infrastructure.)

Never heard of MQSeries before, either. See above.

> J2EE Certified
> (Independently certified to be J2EE compliant.)

Um, why would a database be "J2EE certified"? We *are* talking about an
RDBMS here, not a JVM.

> SQLJ
> (Allows developing Java applications with embedded static SQL.)

We don't have SQLJ because, to be frank, it's a worthless hybrid. We
currently have two different implementations of procedural Java which can be
used inside the database, either of which is superior in design to SQLJ,
although I'm not sure of their production status.

> FIPS/SQL92E Support
> (Complies with FIPS/SQL92E standard.)

Not certain about this one. We may support it, we may not.

> Database XML Tools
> (XML tools to easily store, retrieve, and search the content of XML
documents. Ability to map XML definitions to relational tables to dynamically
compose and decompose XML documents on demand.)

Our support for XML is through 3rd-party projects and plug-ins. As I do not
do a lot of XML-to-DB, I cannot relate the exact status of these projects.

> Java UDFs
> (Support for cross platform User Defined Functions.)
> Java Stored Procedures
> (Support for cross platform Stored Procedures.)

We have PL/java and PL/J to do both of the above things. My personal
knowledge is insufficient to compare our implementation with DB2s.

For that matter, we support 10 Procedural Languages *in addtion to* Java.
These include Perl, Python, Ruby, R, Tcl, and PHP. DB2 supports, AFAIK,
only Java and a SQL-scripting dialect.

> WW 24x7 Support

We have a few companies that offer 24x7 support in specific areas. Certainly
IBM has a leg up on us for offering worldwide support -- especially since
they support PostgreSQL in some locations!

--
Josh Berkus
PostgreSQL Advocacy Volunteer
http://advocacy.postgresql.org/
San Francisco

Responses

Browse pgsql-advocacy by date

  From Date Subject
Next Message Robby Russell 2004-03-09 19:24:01 Re: How many Libraries of Congress can PostgreSQL
Previous Message Richard Huxton 2004-03-09 09:27:53 Re: How many Libraries of Congress can PostgreSQL