Skip site navigation (1) Skip section navigation (2)

Re: optimising data load

From: Ron Johnson <ron(dot)l(dot)johnson(at)cox(dot)net>
To: PgSQL Novice ML <pgsql-novice(at)postgresql(dot)org>
Subject: Re: optimising data load
Date: 2002-05-23 02:40:07
Message-ID: 1022121607.19121.81.camel@rebel (view raw or flat)
Thread:
Lists: pgsql-novice
On Wed, 2002-05-22 at 16:05, John Taylor wrote:
> On Wednesday 22 May 2002 18:40, Ron Johnson wrote:
> > On Wed, 2002-05-22 at 08:45, John Taylor wrote:
> > > 
> > > Hi all,
> > > 
> > > I'm (still) working on an application to regularly populate my database with some data provided from
> > > a third party DB.
> > > 
> > > I'm still having really bad performance problems.
> > > There are 300,000 records to be inserted, but I'm only getting 10,000/hour.
> > > 30 hours to populate a single table is just not on. There must be something seriously bad going on.
> > 
> > From what I see of your INSERT statement, you are joining orderheader 
> > to orderlines and then inserting those records into orderlines.
> > Is that correct?
> 
> Yes, unfortunately I get incomplete data, so I need to select some details before inserting.
> 
> > 
> > I think I'd make a view from the SELECT portion.  Then, nightly:
> > 1. COPY out the view.
> > 2. Drop the index(es) on orderlines
> > 3. COPY into orderlines
> > 4. recreate indexes on orderlines.
> 
> So, how would I merge the data from the view with the data to copy in ?

???  Unless I am misunderstanding you:

0. create view v_new_orderlines (theorder, type, stock, line,
       ordercurrent, sellingquant, price, discount, vatrate, comment)
   as  SELECT oh.theorder, 'P', '0310', coalesce(ol.line+1,1), 5, 0,
              .52, 0, 0, ''
       FROM orderheader oh 
               LEFT OUTER JOIN orderlines ol 
                       ON oh.theorder = ol.theorder
       WHERE oh.account=' MILN1' 
         AND oh.delivery=1 
         AND oh.thedate='2002-06-01' 
         AND oh.ordertype='O'
       ORDER BY ol.line ;
1. COPY v_new_orderlines to '/tmp/new_orderlines.tsv';
2. DROP each index from orderlines
3. COPY orderlines from '/tmp/new_orderlines.tsv';
4. CREATE each index on orderlines


> I was thinking of generating the copy file by performing a select for each row,
> but a selecting each row into a file, and then copy in might be more expensive than
> just inserting it like I do now. 
> 
> > 
> > Do the same for orderlineupdates.  The thing is is that 
> > modifying indexes is a _very_ expensive operation, no matter
> > what RDBMS or hardware configuration you have...
> > 
> 
> Yes, I was thinking of that for my next step.
> Once I figure how to get copy working, I'll also try dropping indexes.

-- 
+---------------------------------------------------------+
| Ron Johnson, Jr.        Home: ron(dot)l(dot)johnson(at)cox(dot)net     |
| Jefferson, LA  USA      http://ronandheather.dhs.org:81 |
|                                                         |
| "I have created a government of whirled peas..."        |
|   Maharishi Mahesh Yogi, 12-May-2002,                   |
!   CNN, Larry King Live                                  |
+---------------------------------------------------------+


In response to

Responses

pgsql-novice by date

Next:From: Thomas A. LoweryDate: 2002-05-23 05:26:57
Subject: Re: pl/perl Documentation
Previous:From: Josh BerkusDate: 2002-05-23 00:36:31
Subject: Re: Moving data to different hard drive /or/ incremental backup

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group