Re: Issues with \copy from file

From: Sigurgeir Gunnarsson <sgunnars(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Issues with \copy from file
Date: 2009-12-18 15:51:48
Message-ID: ebd3ad520912180751q26d770d0gad7f4a516d9e3b53@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

The intention was never to talk down postgresql but rather trying to get
some explanation of this difference so that I could do the proper changes.

After having read the link from Euler's post, which I oversaw, I have
managed to shorten the import time. My problem was with the indexes. I was
able to shorten the import time, of a 26 million line import, from 2 hours +
(I gave up after that time) downto 12 minutes by dropping the indexes after
truncate and before copy.

This is what I was expecting and I'm therefore satisfied with the result.

Regards, Sigurgeir

2009/12/18 Robert Haas <robertmhaas(at)gmail(dot)com>

> On Fri, Dec 18, 2009 at 7:46 AM, Sigurgeir Gunnarsson
> <sgunnars(at)gmail(dot)com> wrote:
> > I hope the issue is still open though I haven't replied to it before.
> >
> > Euler mentioned that I did not provide any details about my system. I'm
> > using version 8.3 and with most settings default on an old machine with 2
> GB
> > of mem. The table definition is simple, four columns; id, value, x, y
> where
> > id is primary key and x, y are combined into an index.
> >
> > I'm not sure if it matters but unlike Euler's suggestion I'm using \copy
> > instead of COPY. Regarding my comparison to MySQL, it is completely
> valid.
> > This is done on the same computer, using the same disk on the same
> platform.
> > From that I would derive that IO is not my problem, unless postgresql is
> > doing IO twice while MySQL only once.
> >
> > I guess my tables are InnoDB since that is the default type (or so I
> think).
> > BEGIN/COMMIT I did not find change much. Are there any other suggestions
> ?
>
> Did you read Matthew Wakeling's reply? Arranging to skip WAL will
> help a lot here. To do that, you need to either create or truncate
> the table in the same transaction that does the COPY.
>
> The problem with the MySQL comparison is that it's not really
> relevant. It isn't that the PostgreSQL code just sucks and if we
> wrote it properly it would be as fast as MySQL. If that were the
> case, everyone would be up in arms, and it would have been fixed long
> ago. Rather, the problem is almost certainly that it's not an
> apples-to-apples comparison. MySQL is probably doing something
> different, such as perhaps not properly arranging for recovery if the
> system goes down in the middle of the copy, or just after it
> completes. But I don't know MySQL well enough to know exactly what
> the difference is, and I'm not particularly interested in spending a
> lot of time figuring it out. I think you'll get that reaction from
> others on this list as well, but of course that's up to them.
> Everybody here is a volunteer, of course, and generally our interest
> is principally PostgreSQL.
>
> On the other hand, we can certainly give you lots of information about
> what PostgreSQL is doing and why that takes the amount of time that it
> does, or give you information on how you can find out more about what
> it's doing.
>
> ...Robert
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Michael N. Mikhulya 2009-12-18 16:18:10 Re: Idea how to get rid of Bitmap Heap Scan
Previous Message Matthew Wakeling 2009-12-18 15:51:11 Re: Idea how to get rid of Bitmap Heap Scan