Re: Article on MySQL vs. Postgres

From: JanWieck(at)t-online(dot)de (Jan Wieck)
To: PostgreSQL HACKERS <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Article on MySQL vs. Postgres
Date: 2000-07-04 22:39:04
Message-ID: 200007042239.AAA04753@hot.jw.home
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tim Perdue wrote:
> On wednesday or thursday, I'm going to be publishing my article on MySQL
> vs. Postgres on PHPBuilder.com.
>
> Before I do that I want to confirm the major problem I had w/postgres:
> the 8K tuple limit. When trying to import some tables from MySQL,
> postgres kept choking because MySQL has no such limit on the size of a
> row in the database (text fields on MySQL can be multi-megabyte).

I just committed the first portion of TOAST. Enabling lztext
fields to hold multi-megabytes too. But it's not the answer
to such big objects. I have plans to add an Oracle like
large object handling in a future version.

> I actually intended the article to be a win for Postgres, as I've used
> it and had good luck with it for such a long time, but if you look at
> the results below, it seems very positive for MySQL.

It's never a good plan to have an initial intention which of
the competitors should finally look good. It's visible
between the lines.

> Performace/Scalability:
>
> MySQL was About 50-60% faster in real-world web serving, but it crumbles
> under a real load. Postgres on the other hand scaled 3x higher than
> MySQL before it started to crumble on the same machine. Unfortunately,
> Postgres would probably still lose on a high-traffic website because
> MySQL can crank out the pages so much faster, number of concurrent
> connections is hard to compare. MySQL also seems to make better use of
> multiple-processor machines like the quad-xeon I tested on. Postgres
> never saturated all 4 processors as MySQL did.

The question in this case is "what is real-world web
serving"? To spit out static HTML pages loaded into a
database? To handle discussion forums like OpenACS with high
concurrency and the need for transactions?

Web applications differ in database usage as much as any
other type of application. From huge amounts of static, never
changing data to complex data structures with many
dependencies constantly in motion. There is no such one
"real world web scenario".

> Tools:
> MySQL has some nice admin tools that allow you to watch individual
> connections and queries as they progress and tools to recover from
> corruption. I haven't seem any similar tools for postgres.

Yepp, we need alot more nice tools.

> Long-term stability:
> Postgres is undoubtably the long-run winner in stability, whereas MySQL
> will freak out or die when left running for more than a month at a time.
> But if you ever do have a problem with postgres, you generally have to
> nuke the database and recover from a backup, as there are no known tools
> to fix index and database corruption. For a long-running postgres
> database, you will occasionally have to drop indexes and re-create them,
> causing downtime.

Not true IMHO. We had some problems with indices in the past.
But you can drop/recreate them online and someone running a
query concurrently might just use a sequential scan during
that time. All other corruptions need backup and recovery.
WAL is on it's way.

> Usability:
> Both databases use a similar command-line interface. Postgres uses
> "slash commands" to help you view database structures. MySQL uses a more
> memorable, uniform syntax like "Show Tables; Show Databases; Describe
> table_x;" and has better support for altering/changing tables, columns,
> and even databases.

Since professional application development starts with a data
design, such "describe" commands and "alter" features are
unimportant. The more someone needs them, the more I know
that he isn't well educated.

Productional installations don't need any "alter" command at
all. New features are developed in the development area,
tested with real life data in the test environment and moved
to the production server including a maybe required data
conversion step during a downtime.

24/7 scenarios require hot standby, online synchronized
databases with hardware takeover. All that is far away from
our scope by now.

> Features:
> Postgres is undoubtedly far, far more advanced than MySQL is. Postgres
> now supports foreign keys, which can help with referential integrity.
> Postgres supports subselects and better support for creating tables as
> the result of queries. The "transaction" support that MySQL lacks is
> included in Postgres, although you'll never miss it on a website, unless
> you're building something for a bank, and if you're doing that, you'll
> use oracle.

FOREIGN KEY doesn't help with referential integrity, it
guarantees it. No application must ever worry if it will
find the customer when it has a problem report. It does a
SELECT and has it or it would've never found the problem
report first - period.

And for big, functional expanding web sites, it does so even
if one of a dozen programmers forgot it once. If the
constraint says you cannot delete a customer who payed until
end of the year, the database won't let you, even if one of
the 7 CGI programs that can delete customers doesn't check.

Transactions are the base for any data integrity. Especially
in the web environment. Almost every web server I've seen has
some timeout for CGI, ADP, ASP or whatever they call it. As
soon as your page needs to update more than one table, you
run the risk of getting aborted just between, leaving the
current activity half done. No matter if a database supports
FOREIGN KEY. I could live without it, but transactions are
essential.

Fortunately the MySQL team has changed it's point of view on
that detail and made some noticeable advantage into that area
by integrating BDB. The lates BETA does support transactions
including rollback as they announced. As far as I see it, the
integration of BDB only buys them transactions, on the cost
of performance and maintainence efford. So the need for it
cannot be that small as you think.

Final notes:

I hate these "MySQL" vs. "PostgreSQL" articles that want to
say "this one is the better". Each one has it's advantages
and disadvantages. Both have a long TODO.

Your article might better analyze a couple of different
"real-world web services", telling what DB usage profile they
have and then suggesting which of the two databases is the
better choice in each case.

MySQL is a tool and PostgreSQL is a tool. But as with other
tools, a hammer doesn't help if you need a screw driver.

Please don't intend to tell anyone either of these databases
is "the best". You'd do both communities a bad job. Help
people to choose the right database for their current needs
and tell them to reevaluate their choice for the next project
instead of blindly staying with the same database. We'll end
up with alot of customers using both databases parallel for
different needs.

At the bottom line both teams share the same idea, open
source. Anyone who pays a license fee is a loss (looser?) for
all of us.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jan Wieck 2000-07-04 23:09:15 Re: [HACKERS] Re: Revised Copyright: is this more palatable?
Previous Message Brian Piatkus 2000-07-04 20:37:21 Case sensitivity