Re: Notes on converting from MySQL 5.0.x to PostgreSQL

From: Scott Marlowe <smarlowe(at)g2switchworks(dot)com>
To: Jason McManus <mcmanus(dot)jason(at)gmail(dot)com>
Cc: pgsql general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Notes on converting from MySQL 5.0.x to PostgreSQL
Date: 2006-06-30 15:12:14
Message-ID: 1151680333.13129.55.camel@state.g2switchworks.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-www

I agree with Tom, nice notes. I noted a few minor issues that seem to
derive from a familiarity with MySQL. I'll put my corrections below...

On Fri, 2006-06-30 at 08:17, Jason McManus wrote:
> On Converting from MySQL 5.0.x to PostgreSQL 8.1.x
> --------------------------------------------------
> Major differences I have noted:
> -------------------------------
>
> MySQL 5.0.x:

> * Easy, built-in and extensive replication support.

Not sure how extensive it is. It's basically synchronous single master
single slave, right? It is quite easy though.

> PostgreSQL 8.1.x:
> * Embedded procedures in multiple native languages (stored procedures and
> functions can be written in native Perl, Python, Tcl, SQL, PL/PgSQL)

Note that there are a dozen or more other languages as well. Just FYI.
Off the top of my head, plPHP, plJ (java there's two different java
implementations, I think) and plR (R is the open source equivalent of
the S statistics language)

> * Replication support still rudimentary.

Hmmmm. I think that's an overly simplistic evaluation. The slony
replication engine is actually VERY advanced, but the administrative
tools consist mostly of "your brain". hehe. That said, once you've
learned how to drive it, it's quite amazing. Keep in mind, slony can be
applied to a living database while it's running, and can run between
different major versions of postgresql. That's a pretty advanced
feature. Plus, if the replication daemons die (kill -9ed or whatever)
you can restart replication and slony will come right back where it was
and catch up.

> Pointers, tips, quick facts and gotchas for other people converting:
> --------------------------------------------------------------------
>
> * MySQL combines the concepts of 'database' and 'schema' into one. PostgreSQL
> differentiates the two. While the hierarchy in MySQL is
> database.table.field, PostgreSQL is roughly: database.schema.table.field.
> A schema is a 'logically grouped set of tables but still kept within a
> particular database.' This could allow separate applications to be built
> that still rely upon the same database, but can be kept somewhat logically
> separated. The default schema in each database is called 'public', and is
> the one referred to if no others are specified. This can be modified with
> 'SET search_path TO ...'.

This is a VERY good analysis of the difference between the two
databases.

> * Pg uses a 'template1' pseudo-database that can be tailored to provide
> default objects for new database creation, if you should desire. It
> obviously also offers a 'template0' database that is read-only and
> offers a barebones database, more equivalent to the empty db created with
> mysql's CREATE DATABASE statement.

This isn't quite right.

template0 is a locked and "pure" copy of the template database. It's
there for "break glass in case of emergency" use. :)

template1, when you first initdb, is exactly the same as template0, but
you can connect to it, and alter it. Both of these are "real"
postgresql databases. template1 is the database that gets copied by
default when you do "create database". Note that you can also define a
different template database when running create database, which lets you
easily clone any database on your machine. "create database newdb with
template olddb"

> * Pg uses the 'serial' column type instead of AUTO_INCREMENT. This allows
> more than one independent sequence to be specified per table (though the
> utility of this may be of dubious value). These are closer to Oracle's
> concept of sequence generators, and they can be manipulated with the
> currval(), nextval(), setval(), and lastval() functions.

Don't forget 64bit bigserials too.

> * Pg requires its tables and databases be 'vacuumed' regularly to remove
> completed transaction snapshots and optimize the tables on disk. It is
> necessary because the way that PostgreSQL implements true MVCC is by
> writing all temporary transactions to disk and setting a visibility
> flag for the record. Vacuuming can be performed automatically, and in
> a deferred manner by using vacuum_cost settings to limit it to low-load
> periods or based upon numerous other criteria. See the manual for more
> information.

Interestingly enough, MySQL's innodb tables do almost the exact same
thing, but their vacuum process is wholly automated. Generally, this
means fewer issues pop up for the new dba, but when they do, they can be
a little harder to deal with. It's about a wash. Of course, as you
mentioned earlier, most mysql folks aren't using innodb.

> * While MySQL supports transactions with the InnoDB databases, many MySQL
> users generally do not use them extensively enough. With Pg, due to the
> behaviour of the server in attempting to ensure data integrity in a
> variety of situations (client disconnection, network trouble, server
> crashes, etc.), it is highly advisable to become familiar and utilize
> transactions a lot more, to ensure your DATA is left in a consistent state
> before and after every change you wish to make.

A point you might want to throw in here is that EVERYTHING in postgresql
is a transaction. If you don't issue a begin statement, then postgresql
runs each statement you type in inside its own transaction.

This means that inserting 10,000 rows without wrapping them inside an
explicit transaction results in 10,000 individual transactions.

However, the more interesting thing here, is that every statement,
including DDL is transactable, except for a couple of big odd ones, like
create database. So, in postgresql, you can do:

begin;
create table xyz...
alter table abc...
insert into abc select * from iii
update iii...;
drop table iii;
(oops, I messed up something)
rollback;

and there's no change and no lost data. Quite impressive actually.

> Common equivalents:
> -------------------
>
> MySQL PostgreSQL
> ----- -----------
> OPTIMIZE TABLE ... VACUUM ...

vacuum and analyze for optimize I think. Also, possibly reindex,
although nominally that's the "sledge hammer" of optimization.

One last thing I'd mention that I REALLY like about PostgreSQL over any
other database I've used is that the psql interface has a complete
syntax lookup feature that is WAY cool. \h brings it up, and \h COMMAND
where COMMAND is the command you want to look up will bring up the
syntax for your command.

And, I hate the fact that CTRL-C in the mysql command line tool exits
the tool instead of interrupting the current query. In PostgreSQL it
interrupts the current query. CTRL-\ will kill the client if you need
to.

Overall, a great review. Thanks.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alan Bullock 2006-06-30 15:13:12 limit results to one row per foreign object
Previous Message Antonis Christofides 2006-06-30 15:08:29 Re: User privileges in web database applications

Browse pgsql-www by date

  From Date Subject
Next Message Dave Page 2006-06-30 15:16:16 Re: Notes on converting from MySQL 5.0.x to PostgreSQL 8.1.4
Previous Message Tom Lane 2006-06-30 14:47:26 Re: Notes on converting from MySQL 5.0.x to PostgreSQL 8.1.4