Postgres vr.s MySQL- style differences?

From: Brian Hurt <bhurt(at)janestcapital(dot)com>
To: pgsql-advocacy(at)postgresql(dot)org
Subject: Postgres vr.s MySQL- style differences?
Date: 2007-05-28 14:29:59
Message-ID: 465AE767.7030809@janestcapital.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-advocacy


I'm still new to databases (though not to computing in general), and
I'll admit that I haven't ever really used MySQL. But listening in to
the various conversations comparing MySQL to Postgres, one recurring
theme I've seen is that accessing the different databases requires
different styles. And that many of the problems in switching from one
database to the other is mainly a matter of problems with the style.
Especially when the style is encoded in software.

Take, for example, inserting large numbers of records into a table. As
someone who was raised on Postgres (so to speak), when loading large
numbers (more than 10 or so) records into a table I immediate start
reaching for a copy command. And our code base is littered with copies-
from simple programs that given a CSV file and a table name burst-load a
table, to complicated libraries that use multithreaded queues and switch
between copies and transactional inserts depending upon how much data
needs to be copied. But someone used to MySQL would just start firing
off lone, non-transactional insert statements, and their code base
wouldn't contain such programs or libraries.

Or take transactions. Our code operates on the assumption that the
database handles concurrency just fine. And that the code can feel free
to haul off and do whatever it likes- insert, delete, update, or select
whatever data it feels like- whenever it feels like, without regard to
whoever else is accessing the same data. Don't worry, the database will
sort it all out for you. I'd be terrified to take this code base to a
non-transactional database, because I have no idea where the
synchronization issues are- they haven't been a performance issue, so no
thought whatsoever has been given to them beyond "let the database
handle it". But I don't think there are many, or that they'd be that
hard to code around- if the code base had been written from the get go
knowing that we couldn't rely on the database for concurrency,
adjustments could be made.

As a side note, given a choice between having to specially handle
inserts vr.s specially handle synchronization, I'll take specially
handling inserts any day.

Or take the multi-table joins. I used to laugh at the idea of 27-table
joins, until I found one in our code. I was doing a complicated query
against a view, which was joining together several other views, which
were joining together several other views, which were joining together
views where were joining together tables. No, I'm not exagerating for
comedic effect, the views really did go four layers deep. Why? Because
Postgres just handled it, so there was no real pushback for making views
of views of views of views. I only noticed because the complicated
query was somewhat slow (not intolerably slow, just somewhat slow) and
went investigating why. If we were working with a database that didn't
handle multi-table joins nearly so well, we probably would have
redesigned the table structure at some point to limit the depth of the
views, and the number of tables in the query.

My point here is to try to explain why moving code between MySQL and
Postgres, in either direction, always seems fraught with difficulty.
And why the new database always seems to be of less value than the old
database. If you've already spent the time and effort to avoid
synchronization problems outside the database, and to limit the number
of tables you're joining in your queries, then not having to do these
things has zero value. On the other hand, if you haven't already
special cased inserts, having to now go back and special case inserts is
a cost. Likewise, if you've already paid the cost to use copy instead
of inserts on inserting, not having to do so is a minor advantage- but
if you have given little thought to limiting the number of tables you
join (barring performance problems), or synchronization issues (beyond
an occassional begin and commit), you're looking at a large cost on
switching.

Just my $0.02.

Brian

Responses

Browse pgsql-advocacy by date

  From Date Subject
Next Message Andrew Sullivan 2007-05-28 15:01:07 Re: Postgres vr.s MySQL- style differences?
Previous Message Alvaro Herrera 2007-05-25 22:44:11 Re: Is MySQL fit for the enterprise?