Re: Importing Large Amounts of Data

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Curt Sampson <cjs(at)cynic(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Importing Large Amounts of Data
Date: 2002-04-16 01:44:26
Message-ID: 200204160144.g3G1iQV21731@candle.pha.pa.us
Lists: pgsql-hackers

Curt Sampson wrote:
> On Mon, 15 Apr 2002, Tom Lane wrote:
>
> > > I'm not looking for "runs a bit faster;" five percent either way
> > > makes little difference to me. I'm looking for a five-fold performance
> > > increase.
> >
> > You are not going to get it from this; where in the world did you get
> > the notion that data integrity costs that much?
>
> Um...the fact that MySQL imports the same data five times as fast? :-)
>
> Note that this is *only* related to bulk-importing huge amounts of
> data. Postgres seems a little bit slower than MySQL at building
> the indices afterwards, but this would be expected since (probably
> due to higher tuple overhead) the size of the data once in postgres
> is about 75% larger than in MySQL: 742 MB vs 420 MB. I've not done
> any serious testing of query speed, but the bit of toying I've done
> with it shows no major difference.

Can you check your load and see whether there is a PRIMARY KEY on the table
at the time it is being loaded? In the old days, we created indexes
only after the data was loaded, but when we added PRIMARY KEY, pg_dump
started creating the table with the PRIMARY KEY and then loading it, meaning
the table was being loaded while it already had an index. I know we fixed
this recently, but I am not sure whether the fix made it into 7.2.
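
A minimal sketch of the load order described above: create the table with no
key, bulk-load it, and only then build the unique index. It uses Python's
bundled sqlite3 module purely as a stand-in so that it runs anywhere; it is
not PostgreSQL, and the table name "items", the index name "items_id_key",
and both helper functions are hypothetical. It shows the ordering only and
makes no claim about timings.

```python
import sqlite3


def load_then_index(rows):
    """Bulk-load rows into a table with no index, then add the unique index.

    SQLite stands in for the real database here (an assumption made so the
    sketch is self-contained); the point is only the order of the steps.
    """
    conn = sqlite3.connect(":memory:")

    # Step 1: create the table WITHOUT a primary key or any other index.
    conn.execute("CREATE TABLE items (id INTEGER, payload TEXT)")

    # Step 2: bulk-load inside a single transaction; no index is maintained
    # while the rows go in.
    with conn:
        conn.executemany(
            "INSERT INTO items (id, payload) VALUES (?, ?)", rows
        )

    # Step 3: build the unique index once, after all the data is in place.
    conn.execute("CREATE UNIQUE INDEX items_id_key ON items (id)")
    return conn


def index_names(conn, table):
    """Return the names of the indexes that currently exist on a table."""
    cur = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = ? ORDER BY name",
        (table,),
    )
    return [name for (name,) in cur]


if __name__ == "__main__":
    conn = load_then_index([(i, f"row {i}") for i in range(1000)])
    print(index_names(conn, "items"))
```

To answer the question for a real load, the equivalent check is whether the
index already exists before the first row is inserted; calling index_names()
between steps 1 and 2 would return an empty list in this sketch.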

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
