Re: 8.4 open item: copy performance regression?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alan Li <ali(at)truviso(dot)com>
Subject: Re: 8.4 open item: copy performance regression?
Date: 2009-06-21 15:16:35
Message-ID: 603c8f070906210816x10cfca35m6ceb5aa05f56b7dd@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jun 21, 2009 at 6:48 AM, Stefan
Kaltenbrunner<stefan(at)kaltenbrunner(dot)cc> wrote:
> So I do think that IO is in fact not too significant for this kind of
> testing and we still have ways to go in terms of CPU efficiency in COPY.

It would be interesting to see some gprof or oprofile output from that
test. I went back and dug up the results I got when I profiled this
patch during initial development. With my version of the patch
applied, the profile looked like this on my system:

  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 14.48      0.85     0.85        1     0.85     5.47  DoCopy
 10.05      1.44     0.59 10000001     0.00     0.00  CopyReadLine
  5.62      1.77     0.33 10000039     0.00     0.00  PageAddItem
  5.11      2.07     0.30 10400378     0.00     0.00  LWLockRelease
  4.68      2.35     0.28 10000013     0.00     0.00  heap_insert
  4.34      2.60     0.26 10000012     0.00     0.00  heap_formtuple
  3.83      2.83     0.23 10356158     0.00     0.00  LWLockAcquire
  3.83      3.05     0.23 10000054     0.00     0.00  MarkBufferDirty
  3.32      3.25     0.20 10000013     0.00     0.00  RelationGetBufferForTuple
  3.07      3.43     0.18 10000005     0.00     0.00  pg_verify_mbstr_len
  2.90      3.60     0.17 10000002     0.00     0.00  CopyGetData
  2.73      3.76     0.16 20000030     0.00     0.00  enlargeStringInfo
  2.73      3.92     0.16 20000014     0.00     0.00  pq_getbytes
  2.04      4.04     0.12 10000000     0.00     0.00  InputFunctionCall

...but this might not be very representative, since I think I may have
tested it on a single-column table. It would be interesting to see
some other results.
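
For anyone comparing their own profile against the one above, here is a
rough sketch (Python, not part of the patch) that turns flat-profile
rows into average self time per call. gprof rounds the s/call columns to
0.00 at this call volume, so this recovers the figure it hides. The rows
are copied from the profile above; the helper names per_call_ns and
self_seconds_for are made up for illustration.

```python
# Rows copied from the flat profile above: (name, self_seconds, calls).
PROFILE = [
    ("DoCopy", 0.85, 1),
    ("CopyReadLine", 0.59, 10000001),
    ("PageAddItem", 0.33, 10000039),
    ("LWLockRelease", 0.30, 10400378),
    ("heap_insert", 0.28, 10000013),
    ("heap_formtuple", 0.26, 10000012),
    ("LWLockAcquire", 0.23, 10356158),
    ("MarkBufferDirty", 0.23, 10000054),
    ("RelationGetBufferForTuple", 0.20, 10000013),
    ("pg_verify_mbstr_len", 0.18, 10000005),
    ("CopyGetData", 0.17, 10000002),
    ("enlargeStringInfo", 0.16, 20000030),
    ("pq_getbytes", 0.16, 20000014),
    ("InputFunctionCall", 0.12, 10000000),
]


def per_call_ns(self_seconds, calls):
    """Average self time per call, in nanoseconds (hypothetical helper)."""
    return self_seconds / calls * 1e9


def self_seconds_for(names, profile=PROFILE):
    """Sum the self seconds of the named functions (hypothetical helper)."""
    wanted = set(names)
    return sum(secs for name, secs, _ in profile if name in wanted)


if __name__ == "__main__":
    for name, secs, calls in PROFILE:
        print(f"{name:<28} {per_call_ns(secs, calls):16.1f} ns/call")
    lwlock = self_seconds_for(["LWLockAcquire", "LWLockRelease"])
    print(f"LWLockAcquire + LWLockRelease self time: {lwlock:.2f} s")
```

Feeding in rows from a multi-column table would show whether the
per-call numbers shift much relative to the single-column case.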

Simon had the idea of further improving performance by keeping the
current buffer locked (this patch just kept it pinned, but not
locked), but I didn't see an obvious clean design for that. Heikki
also had a patch for speeding up copy, but it got dropped for 8.4 due
to time constraints.

...Robert
