Re: Article on MySQL vs. Postgres

From: Benjamin Adida <ben(at)mit(dot)edu>
To: Tim Perdue <tperdue(at)valinux(dot)com>, <pgsql-hackers(at)hub(dot)org>
Subject: Re: Article on MySQL vs. Postgres
Date: 2000-07-04 23:37:25
Message-ID: B587ED74.6794%ben@mit.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

on 7/4/00 3:42 PM, Tim Perdue at tperdue(at)valinux(dot)com wrote:

> Before I do that I want to confirm the major problem I had w/postgres:
> the 8K tuple limit. When trying to import some tables from MySQL,
> postgres kept choking because MySQL has no such limit on the size of a
> row in the database (text fields on MySQL can be multi-megabyte).

It's possible in the current version to up your tuple limit to 16K before
compilation, and you can use lztext, the compressed text type, which should
give you up to 32K of storage. Netscape's textarea limit is 32K, so that's a
good basis for doing a number of web-based things. Anything that is
multi-megabyte is really not something I'd want to store in an RDBMS.

> I actually intended the article to be a win for Postgres, as I've used
> it and had good luck with it for such a long time, but if you look at
> the results below, it seems very positive for MySQL.

Jan said that each tool has its value, and that's true. I recommend you
define your evaluation context before you write this. Is this for running a
serious mission-critical web site? Is it for logging web site hits with
tolerance for data loss and a need for doing simple reporting?

> Performace/Scalability:
>
> MySQL was About 50-60% faster in real-world web serving, but it crumbles
> under a real load. Postgres on the other hand scaled 3x higher than
> MySQL before it started to crumble on the same machine. Unfortunately,
> Postgres would probably still lose on a high-traffic website because
> MySQL can crank out the pages so much faster, number of concurrent
> connections is hard to compare. MySQL also seems to make better use of
> multiple-processor machines like the quad-xeon I tested on. Postgres
> never saturated all 4 processors as MySQL did.

What kind of queries did you perform? Did you use connection pooling (a lot
of PHP apps don't, from what I've seen)? How does the performance get
affected when a query in Postgres with subselects has to be split into 4
different queries in MySQL? Postgres is process-based, each connection
resulting in one process. If you use connection pooling with at least as
many connections as you have processors, you should see it scale quite well.
In fact, for serious load-testing, you should have 10-15 pooled connections.

I *strongly* question your intuition on Postgres running web sites. MySQL's
write performance is very poor, which forces excessive caching (see sites
like Slashdot) to prevent updates from blocking entire web site serving.
Yes, the BDB addition might be useful. Let's see some performance tests
using BDB tables.

> Postgres is undoubtably the long-run winner in stability, whereas MySQL
> will freak out or die when left running for more than a month at a time.
> But if you ever do have a problem with postgres, you generally have to
> nuke the database and recover from a backup, as there are no known tools
> to fix index and database corruption. For a long-running postgres
> database, you will occasionally have to drop indexes and re-create them,
> causing downtime.

Dropping indexes and recreating them does not cause downtime. I've run a
couple of postgres-backed web sites for months on end with no issues. I've
survived a heavy slashdotting on my dual Pentium II-400, with Postgres
WRITES and READS on every Slashdot-referred hit, resulting in perfectly
respectable serving times (less than 3-4 seconds to serve > 20K of data on
each hit). No caching optimization of any kind on the app layer. And I'd
forgotten to vacuum my database for a few days.

> Features:
> Postgres is undoubtedly far, far more advanced than MySQL is. Postgres
> now supports foreign keys, which can help with referential integrity.
> Postgres supports subselects and better support for creating tables as
> the result of queries. The "transaction" support that MySQL lacks is
> included in Postgres, although you'll never miss it on a website, unless
> you're building something for a bank, and if you're doing that, you'll
> use oracle.

I'm just shocked at this. Where did this "transactions aren't necessary"
school of thinking originate? I've been developing database-backed web sites
for 5 years now, and I can't conceive of building a serious web site without
transactions. How do you guarantee that a record and its children records
are all stored together successfully? Do you run on a magic power grid that
never fails? Do you never have code-related error conditions that require
rolling back a series of database edits?

One quick point: while you may well be personally unbiased, VA Linux just
endorsed and funded MySQL. SourceForge uses MySQL. How do you expect to
convince readers that you're being objective in this comparison?

-Ben

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Chris Bitmead 2000-07-04 23:42:38 Re: heap_create with OID?
Previous Message Jan Wieck 2000-07-04 23:09:15 Re: [HACKERS] Re: Revised Copyright: is this more palatable?