Re: optimising data load

From: Rasmus Mohr <rmo(at)Netpointers(dot)com>
To: 'John Taylor' <postgres(at)jtresponse(dot)co(dot)uk>, "'pgsql-novice(at)postgresql(dot)org'" <pgsql-novice(at)postgresql(dot)org>
Subject: Re: optimising data load
Date: 2002-05-22 14:29:50
Message-ID: 910513A5A944D5118BE900C04F67CB5A0BFDB2@MAIL
Lists: pgsql-novice

I think something like setting fsync = off in postgresql.conf and dropping
indexes before the load (recreating them afterwards) should boost performance.
I think it helped when we had a similar problem populating our database with
data from an old Access database. Just remember to turn fsync back on when the
load is done; with it off, a crash mid-load can leave the database corrupt.
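Roughly, the sequence I mean is something like the following sketch (the
secondary index name and key columns here are made up for illustration; only
orderlines itself appears in your plans). On 7.1 the COPY ... USING DELIMITERS
form loads a flat file far faster than row-at-a-time INSERTs:

```sql
-- In postgresql.conf (restart the postmaster to pick it up): fsync = off

-- Drop secondary indexes before the bulk load; hypothetical name.
DROP INDEX orderlines_stock_idx;

-- One big transaction instead of one per INSERT.
BEGIN;
COPY orderlines FROM '/tmp/orderlines.dat' USING DELIMITERS ',';
COMMIT;

-- Rebuild the index afterwards, then set fsync = on again.
CREATE INDEX orderlines_stock_idx ON orderlines (stock);
```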

--------------------------------------------------------------
Rasmus T. Mohr          Direct  : +45 36 910 122
Application Developer   Mobile  : +45 28 731 827
Netpointers Intl. ApS   Phone   : +45 70 117 117
Vestergade 18 B         Fax     : +45 70 115 115
1456 Copenhagen K       Email   : mailto:rmo(at)netpointers(dot)com
Denmark                 Website : http://www.netpointers.com

"Remember that there are no bugs, only undocumented features."
--------------------------------------------------------------

> -----Original Message-----
> From: pgsql-novice-owner(at)postgresql(dot)org
> [mailto:pgsql-novice-owner(at)postgresql(dot)org]On Behalf Of John Taylor
> Sent: Wednesday, May 22, 2002 3:46 PM
> To: pgsql-novice(at)postgresql(dot)org
> Subject: [NOVICE] optimising data load
>
>
>
> Hi all,
>
> I'm (still) working on an application to regularly populate
> my database with some data provided from
> a third party DB.
>
> I'm still having really bad performance problems.
> There are 300,000 records to be inserted, but I'm only
> getting 10,000/hour.
> 30 hours to populate a single table is just not on. There
> must be something seriously bad going on.
>
> I have to update two different databases: live, and an update log.
>
> I've included EXPLAIN output for both databases below.
>
> I have a few questions:
>
> 1) Can it really be right that it is going this slow ?
> I'm running Redhat Linux with kernel 2.4.7, and postgres 7.1.3
>
> 2) Can anyone see a way to optimise these queries further ?
>
> 3) Is it likely to be quicker if I use a query to obtain
> ol.line, and then generate a CSV file for use with COPY ?
>
> Thanks
> JohnT
>
> --- LIVE ---
> explain INSERT INTO orderlines
> (theorder,type,stock,line,ordercurrent,sellingquant,price,discount,vatrate,comment)
> SELECT oh.theorder,'P',' 0310',coalesce(ol.line+1,1),5,0,.52,0,0,'' FROM orderheader oh
> LEFT OUTER JOIN orderlines ol ON oh.theorder = ol.theorder
> WHERE oh.account=' MILN1' AND oh.delivery=1 AND oh.thedate='2002-06-01' AND oh.ordertype='O'
> ORDER BY ol.line DESC LIMIT 1;
> NOTICE: QUERY PLAN:
>
> Subquery Scan *SELECT*  (cost=47.41..47.41 rows=1 width=12)
>   ->  Limit  (cost=47.41..47.41 rows=1 width=12)
>         ->  Sort  (cost=47.41..47.41 rows=1 width=12)
>               ->  Nested Loop  (cost=0.00..47.40 rows=1 width=12)
>                     ->  Index Scan using orderheader_account on orderheader oh  (cost=0.00..21.64 rows=1 width=4)
>                     ->  Index Scan using orderlines_pkey on orderlines ol  (cost=0.00..25.54 rows=17 width=8)
>
> EXPLAIN
>
> --- UPDATE LOG ---
> explain INSERT INTO orderlinesupdates
> (theorder,type,stock,line,ordercurrent,sellingquant,price,discount,vatrate,comment,updated,utype,origin)
> SELECT oh.theorder,'P',' 0310',coalesce(ol.line,ol2.line+1,1),5,0,.52,0,0,'',128,'+','C' FROM orderheaderupdates oh
> LEFT OUTER JOIN orderlinesupdates ol ON oh.theorder = ol.theorder AND ol.stock=' 0310'
> LEFT OUTER JOIN orderlinesupdates ol2 ON oh.theorder = ol2.theorder
> WHERE oh.account=' MILN1' AND oh.delivery=1 AND oh.thedate='2002-06-01' AND oh.ordertype='O'
> ORDER BY oh.updated DESC, ol.line DESC, ol2.line DESC LIMIT 1;
> NOTICE: QUERY PLAN:
>
> Subquery Scan *SELECT*  (cost=81.29..81.29 rows=1 width=36)
>   ->  Limit  (cost=81.29..81.29 rows=1 width=36)
>         ->  Sort  (cost=81.29..81.29 rows=1 width=36)
>               ->  Nested Loop  (cost=0.00..81.28 rows=1 width=36)
>                     ->  Nested Loop  (cost=0.00..52.47 rows=1 width=28)
>                           ->  Index Scan using orderheaderupdates_account on orderheaderupdates oh  (cost=0.00..23.62 rows=1 width=8)
>                           ->  Index Scan using orderlinesupdates_theorder on orderlinesupdates ol  (cost=0.00..28.60 rows=17 width=20)
>                     ->  Index Scan using orderlinesupdates_theorder on orderlinesupdates ol2  (cost=0.00..28.60 rows=17 width=8)
>
> EXPLAIN
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to
> majordomo(at)postgresql(dot)org)
>
>
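On question 3 above: yes, resolving ol.line ahead of time and writing a flat
file for COPY should be much faster than 300,000 individual INSERTs. A rough
Python sketch of the file-generation side (the order/line pairs and file path
here are made up; in practice they would come from one query that fetches the
current max line per order). Note that COPY on 7.1 has no CSV quoting mode, so
fields must not themselves contain the delimiter:

```python
import csv

# (theorder, line) pairs resolved beforehand; illustrative values only.
orders = [(1001, 1), (1002, 3)]

def write_copy_file(path, rows):
    """Write orderlines rows in a form COPY ... USING DELIMITERS ',' can load.

    Columns: theorder, type, stock, line, ordercurrent,
    sellingquant, price, discount, vatrate, comment.
    """
    with open(path, "w", newline="") as f:
        w = csv.writer(f)
        for theorder, line in rows:
            w.writerow([theorder, "P", " 0310", line, 5, 0, 0.52, 0, 0, ""])

write_copy_file("orderlines.dat", orders)
# Then in psql: COPY orderlines FROM '/path/orderlines.dat' USING DELIMITERS ',';
```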
