On Converting from MySQL 5.0.x to PostgreSQL 8.1.x
I am in the process of converting a couple of major sites from MySQL 5.0.22
to PostgreSQL 8.1.4, and I thought I would share some of my observations on
this process and the two database systems in general. I feel I am in a good
position to do this, as I am fairly familiar in-depth with MySQL's modern
features, am relatively technology-agnostic, and having done this since
the MySQL project has matured substantially over the last couple of years, I
feel I can give insight into the features of both systems from a current
I will attempt to do this from an objective standpoint, as I still see
benefits and drawbacks to both systems (no software will ever meet the needs
of every situation). If you are looking for a document that mocks one system
or another, this is not it; there are countless results for "X sucks" on
My reasons for making this switch are primarily due to having a bit of spare
time, wanting to expand my horizons and familiarize myself with another
well-respected open source project, some of the nice in-built procedural
language features of PostgreSQL, and basically wanting to form my own
opinion of the features of both systems. That being said, I have really
been impressed so far with the features I am discovering, and becoming
happier each day.
It is by no means comprehensive; I have just highlighted some of the
more notable and obvious differences as I discovered them. Most of the
information in here is pretty basic for those of you very familiar with
both systems. I also only highlight the differences in the Unix/POSIX
versions of these programs. (I don't run Windows as a server, and I don't
care to. My technology agnosticism stops at inadequate systems.)
All comments are current as of the time of writing (Spring/Summer 2006).
Error corrections are very welcome. Flames can go to /dev/null. Don't care.
Oh, and since I wrote this in vim, it is best read with a monospace font. :^)
Major differences I have noted:
* Multiple storage engines with different features.
* Supports multi-insert syntax (INSERT INTO foo VALUES (1,2), (3,4) ...)
* A few more access controls on features built-in to the GRANT tables. Many
of these are still present, but implemented in other ways in PostgreSQL.
* Single AUTO_INCREMENT column allowed per table.
* Easy, built-in and extensive replication support.
* Single datastore location per server.
* ALL Stored Procedures are kept in the mysql system database.
* Embedded procedures in multiple native languages (stored procedures and
functions can be written in native Perl, Python, Tcl, SQL, PL/PgSQL)
* Extensive and versatile procedural language functionality.
* User-definable data types and operators.
* Multiple sequence generators allowed per table.
* Replication support still rudimentary.
* Stored procedures are kept (somewhat more logically, imho) in the
* Multiple datastore locations possible using tablespaces concept.
(For the record, MySQL will have tablespaces when 5.1.x is stabilized.)
* Most system variables, "built-in" types and features configurable as they
are just kept in a system catalog.
* Allows deletions and subselects to specify the same table (e.g. DELETE
FROM foo WHERE id IN (SELECT id FROM foo WHERE...) ). MySQL does not
allow this as of 5.0.22.
* Copious documentation on the database internals, for extending the
Pointers, tips, quick facts and gotchas for other people converting:
* Don't bother using an old version, just go for 8.1.4 (or whatever is new
at the time of your conversion. This should be common sense.)
* Since Pg uses a full transactional storage engine, the speed is roughly
comparable to InnoDB, rather than the stock MyISAM format.
* PostgreSQL's TCP port is 5432 by default.
* The main server process on PostgreSQL is 'postmaster'.
* 'postmaster' can be controlled via the 'pg_ctl' command.
* The administrative user is called 'postgres' by default.
* Like MySQL, Pg uses the system user as default, if no username is
specified when connecting.
* The command-line client is called 'psql'.
* PostgreSQL by default comes configured to disallow network connections.
To enable these, you must follow these steps:
1. Edit $DATADIR/pg_hba.conf and add access permissions.
2. Edit $DATADIR/postgresql.conf and uncomment the listen_addresses
line, setting it to something reasonable.
3. Restart postmaster.
* PostgreSQL relies extensively upon quick aliases for common features within
the CLI shell. MySQL offers many similar features, but they aren't used
as much from what I have observed.
* MySQL combines the concepts of 'database' and 'schema' into one. PostgreSQL
differentiates the two. While the hierarchy in MySQL is
database.table.field, PostgreSQL is roughly: database.schema.table.field.
A schema is a 'logically grouped set of tables but still kept within a
particular database.' This could allow separate applications to be built
that still rely upon the same database, but can be kept somewhat logically
separated. The default schema in each database is called 'public', and is
the one referred to if no others are specified. This can be modified with
'SET search_path TO ...'.
* Pg uses a 'template1' pseudo-database that can be tailored to provide
default objects for new database creation, if you should desire. It
obviously also offers a 'template0' database that is read-only and
offers a barebones database, more equivalent to the empty db created with
mysql's CREATE DATABASE statement.
* Pg's ROLEs can specify a single user or a group, and be nested to contain
* Pg's default character set (in 8.1.4) is UTF8.
* Pg uses the 'serial' column type instead of AUTO_INCREMENT. This allows
more than one independent sequence to be specified per table (though the
utility of this may be of dubious value). These are closer to Oracle's
concept of sequence generators, and they can be manipulated with the
currval(), nextval(), setval(), and lastval() functions.
* Pg requires its tables and databases be 'vacuumed' regularly to remove
completed transaction snapshots and optimize the tables on disk. It is
necessary because the way that PostgreSQL implements true MVCC is by
writing all temporary transactions to disk and setting a visibility
flag for the record. Vacuuming can be performed automatically, and in
a deferred manner by using vacuum_cost settings to limit it to low-load
periods or based upon numerous other criteria. See the manual for more
* Kept internally in Pg, there is a concept called the OID, which is a
continuously incremented number used to assign unique IDs to system
objects. This allows the database to store and refer uniquely to user
operators, new databases, basically anything that the system needs to
refer to in the 'data directory', regardless of user-defined names.
* Most administrative procedures will refuse to run as root, and require you
to su to the 'postgres' system user to perform the action.
* PgAdminIII gives you a great overview of the hierarchy of system objects
throughout the server. Even though you may administrate your server
primarily via the CLI, as I do, it is still valuable during the learning
process to use this tool to browse around the various objects, to learn
the system setup and hierarchy visually.
* While MySQL supports transactions with the InnoDB databases, many MySQL
users generally do not use them extensively enough. With Pg, due to the
behaviour of the server in attempting to ensure data integrity in a
variety of situations (client disconnection, network trouble, server
crashes, etc.), it is highly advisable to become familiar and utilize
transactions a lot more, to ensure your DATA is left in a consistent state
before and after every change you wish to make.
* There is a conversion utility called 'mysql2pgsql' that will convert
dump files from the mysqldump format, to a format that psql can
understand. It is available at:
* To turn on query time output, similar to the mysql CLI, use the '\timing'
command from psql. (Note that the time is displayed in milliseconds,
whereas in the mysql client it is displayed in seconds.)
mysql database system tables
SHOW DATABASES; \l
SHOW GRANTS; \du
SHOW TABLES; \dt
DESC tblname; \d foo
USE dbname; \c dbname
ALTER TABLE foo AUTO_INCREMENT = n; SELECT setval('seq_name',n);
SHOW PROCESSLIST; SELECT * FROM pg_stat_activity;
OPTIMIZE TABLE ... VACUUM ...
Overall, I have been happy and very impressed with the features offered by
PostgreSQL 8.1.4, and believe that I will be using it for the majority of my
future projects. There are still some niches where I see utility for the
pluggable storage engines and raw speed of MySQL. I will be keeping my eye
on that project closely, also, as I want to test out the new Scheduled
Events feature from the 5.1 series once it is stabilized.
I have also been using Apress' "Beginning Databases with PostgreSQL, 2nd
Edition" (2005) as my learning material. I briefly examined Sams'
"PostgreSQL - The Comprehensive Guide, 2nd Ed." (2005) and Pearson's
"PostgreSQL: Introduction and Concepts" (2001), but found the Apress book
the best of the 3 that I had access to. YMMV. More info in the book
reviews linked below.
PostgreSQL manual: http://www.postgresql.org/docs/manuals/
PostgreSQL book reviews:
Community Support Channel: irc.freenode.net in #postgresql
Varlena Consulting's General Bits archives:
Thank you, and I hope that these notes prove helpful to others!
pgsql-www by date
|Next:||From: Tom Lane||Date: 2006-06-30 14:17:10|
|Subject: Re: Notes on converting from MySQL 5.0.x to PostgreSQL 8.1.4 |
|Previous:||From: Selena Deckelmann||Date: 2006-06-30 00:02:21|
|Subject: pdxpug MT site - mod permissions?|
pgsql-general by date
|Next:||From: Martijn van Oosterhout||Date: 2006-06-30 13:21:42|
|Subject: Re: pg_restore: [archiver] could not open input file|
|Previous:||From: Michael Fuhr||Date: 2006-06-30 13:08:20|
|Subject: Re: finding gps within polygon|