Re: COPY into table too slow with index: now an I/O question

From: "Rick Schumeyer" <rschumeyer(at)ieee(dot)org>
To: "'Luke Lonergan'" <llonergan(at)greenplum(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: COPY into table too slow with index: now an I/O question
Date: 2005-12-02 03:26:56
Message-ID: 00ee01c5f6f0$3faec750$0200a8c0@dell8200
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I only have one CPU. Is my copy of iostat confused, or does this have
something to do with hyperthreading or dual core? (AFAIK, I don't have a
dual core!)

The problem (for me) with dropping the index during a copy is that it takes
tens of minutes (or more) to recreate the geometry index once the table has,
say, 50 million rows.

> -----Original Message-----
> From: Luke Lonergan [mailto:llonergan(at)greenplum(dot)com]
> Sent: Thursday, December 01, 2005 9:27 PM
> To: Rick Schumeyer; pgsql-performance(at)postgresql(dot)org
> Subject: Re: [PERFORM] COPY into table too slow with index: now an I/O
> question
>
> Rick,
>
> On 12/1/05 2:18 PM, "Rick Schumeyer" <rschumeyer(at)ieee(dot)org> wrote:
>
> > As a follow up to my own question:
> >
> > I reran the COPY both ways (with the index and without) while running
> iostat.
> > The following values
> > are averages:
> > %user %nice %sys %iowait %idle
> > no index 39 0 2.8 11 47
> > index 16 1.5 2.1 34 46
> >
> > I¹m no performance guru, so please indulge a couple of silly questions:
> >
> > 1) Why is there so much idle time? I would think the CPU would
> either be
> > busy or waiting for IO.
>
> The 100% represents 2 CPUs. When one CPU is fully busy you should see 50%
> idle time.
>
> > 2) It seems that I need to improve my disk situation. Would it
> help to
> > add another drive to my PC and
> > keep the input data on a separate drive from my pg tables? If so, some
> > pointers on the best way to set that up
> > would be appreciated.
>
> Putting the index and the table on separate disks will fix this IMO. I
> think you can do that using the "TABLESPACE" concept for each.
>
> The problem I see is nicely shown by the increase in IOWAIT between the
> two
> patterns (with and without index). It seems likely that the pattern is:
> A - insert a tuple into the table
> B - insert an entry into the index
> C - fsync the WAL
> - repeat
>
> This can be as bad as having a disk seek to access the table data every
> time
> the 8KB page boundary is crossed, then again for the index, then again for
> the WAL, and random disk seeks happen only as fast as about 10ms, so you
> can
> only do those at a rate of 100/s.
>
> > Please let me know if anyone has additional ideas.
>
> This is a fairly common problem, some people drop the index, load the
> data,
> then recreate the index to get around it.
>
> - Luke

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Craig A. James 2005-12-02 03:47:30 Re: 15,000 tables
Previous Message Tom Lane 2005-12-02 03:10:06 Re: COPY into table too slow with index: now an I/O