Re: why postgresql over other RDBMS

From: Andrew Sullivan <ajs(at)crankycanuck(dot)ca>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: why postgresql over other RDBMS
Date: 2007-05-25 21:44:19
Message-ID: 20070525214419.GB1790@phlogiston.dyndns.org
Lists: pgsql-general

On Fri, May 25, 2007 at 05:28:43PM -0400, Tom Lane wrote:
> That's true at the level of DDL operations, but AFAIK we could
> parallelize table-loading and index-creation steps pretty effectively
> --- and that's where all the time goes.

I made a presentation at OSCON a few years ago about how we did it
that way when we imported .org. We had limited time to work in, and
we had to do a lot of validation, so getting the data in quickly was
important. So we split the data files up into segments and loaded
them in parallel. (Chris Browne did most of the implementation of
this.) It was pretty helpful for loading, anyway.
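
For anyone who wants to picture the split-and-load approach, here is
a minimal sketch in Python. It is not the code we used for the .org
import; the function names and the stand-in loader are hypothetical,
and a real loader would presumably open its own connection per
segment and load the rows with something like COPY.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_segments(rows, n_segments):
    """Split rows into at most n_segments contiguous, near-equal chunks."""
    if n_segments < 1:
        raise ValueError("n_segments must be >= 1")
    size, extra = divmod(len(rows), n_segments)
    segments, start = [], 0
    for i in range(n_segments):
        # The first `extra` segments take one additional row each.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty segments when rows < n_segments
            segments.append(rows[start:end])
        start = end
    return segments


def load_in_parallel(segments, load_segment, max_workers=4):
    """Call load_segment on each segment concurrently.

    Results come back in segment order, whatever order the workers
    finish in.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(load_segment, segments))


def fake_load_segment(segment):
    # Hypothetical stand-in for a real loader: it only reports how
    # many rows it was handed, so the sketch runs without a database.
    return len(segment)


if __name__ == "__main__":
    rows = [f"row{i}" for i in range(10)]
    segments = split_into_segments(rows, 3)
    print(load_in_parallel(segments, fake_load_segment))
```

The loader is passed in as a parameter so the splitting and the
scheduling can be tried out without touching a database at all.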

> A more interesting question is what sort of hardware you need for that
> actually to be a win, though. Loading a few tables in parallel sounds
> like an ideal recipe for oversaturating your disk bandwidth...

Right, you need to be prepared for that. But of course, if you're in
the situation where you have to get a given database up and running,
who cares about saturating the disk bandwidth? You don't have the
database running yet. The kind of system that is busy enough to have
that size of database and that urgency of recovery is also the kind
that is likely to have dedicated storage hardware for that database.

A

--
Andrew Sullivan | ajs(at)crankycanuck(dot)ca
Unfortunately reformatting the Internet is a little more painful
than reformatting your hard drive when it gets out of whack.
--Scott Morris
