Re: A little COPY speedup

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: A little COPY speedup
Date: 2007-03-01 17:27:36
Message-ID: 45E70D08.2030200@dunslane.net
Lists: pgsql-patches

Heikki Linnakangas wrote:
> One complaint we've heard from clients trying out EDB or PostgreSQL is
> that loading data is slower than on other DBMSs.
>
> I ran oprofile on a COPY FROM to get an overview of where the CPU time
> is spent. To my amazement, the function at the top of the list was
> PageAddItem with 16% of samples.
>
> On every row, PageAddItem will scan all the line pointers on the
> target page, just to see that they're all in use, and create a new
> line pointer. That adds up, especially with narrow tuples like what I
> used in the test.
>
> Attached is a fix for that. It adds a flag to each heap page that
> indicates that "there aren't any free line pointers on this page, so
> don't bother trying". Heap pages haven't had any heap-specific
> per-page data before, so this patch adds a HeapPageOpaqueData struct
> that's stored in the special space.
>
> My simple test case of a COPY FROM of 10000000 tuples took 19.6 s
> without the patch, and 17.7 s with the patch applied. Your mileage may
> vary.
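
For readers following along, here is a rough sketch of the idea in the
quoted message: skip the scan over the line pointers when a per-page flag
says none of them is free. This is a toy model, not the attached patch and
not PostgreSQL's PageAddItem. ToyPage, no_free_lp, slots_scanned and the
toy_page_* functions are all made-up names for illustration, and the page
is reduced to a bare array of "in use" markers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LINE_POINTERS 256

/* Toy stand-in for a heap page: only the line-pointer array is modeled. */
typedef struct {
    bool   used[MAX_LINE_POINTERS]; /* true if the line pointer is in use */
    size_t nline;                   /* line pointers allocated so far */
    bool   no_free_lp;              /* hypothetical hint: nothing to reuse */
    size_t slots_scanned;           /* instrumentation: slots examined */
} ToyPage;

/* An empty page trivially has no reusable line pointer. */
void toy_page_init(ToyPage *page)
{
    *page = (ToyPage){0};
    page->no_free_lp = true;
}

/*
 * Add one item and return its slot, or -1 if the page is full.
 * With use_hint == false every call scans the existing line pointers,
 * which is the per-row cost described in the quoted message.
 * With use_hint == true the scan is skipped while the hint is set.
 */
int toy_page_add_item(ToyPage *page, bool use_hint)
{
    size_t slot = page->nline;      /* default: append a new line pointer */

    if (!(use_hint && page->no_free_lp)) {
        for (size_t i = 0; i < page->nline; i++) {
            page->slots_scanned++;
            if (!page->used[i]) {
                slot = i;           /* reuse a freed line pointer */
                break;
            }
        }
        if (slot == page->nline)
            page->no_free_lp = true; /* scan found nothing to reuse */
    }

    if (slot == page->nline) {
        if (page->nline == MAX_LINE_POINTERS)
            return -1;              /* no room for another line pointer */
        page->nline++;
    }
    page->used[slot] = true;
    return (int) slot;
}

/* Freeing a line pointer clears the hint so the next add scans again. */
void toy_page_remove_item(ToyPage *page, size_t slot)
{
    assert(slot < page->nline);
    page->used[slot] = false;
    page->no_free_lp = false;
}
```

In this model, filling a fresh page with N items examines N*(N-1)/2 slots
without the hint and none with it, which is why the effect should be most
visible with narrow tuples (many line pointers per page). How the real
patch sets and clears its flag may well differ from this sketch.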

What is the speedup with wider tuples? A 10% improvement is good but
not stellar.

cheers

andrew
