A little COPY speedup

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: A little COPY speedup
Date: 2007-03-01 17:01:17
Message-ID: 45E706DD.6080404@enterprisedb.com
Lists: pgsql-patches

One complaint we've heard from clients trying out EDB or PostgreSQL is
that loading data is slower than on other DBMSs.

I ran oprofile on a COPY FROM to get an overview of where the CPU time
is spent. To my amazement, the function at the top of the list was
PageAddItem with 16% of samples.

On every row, PageAddItem scans all the line pointers on the target
page, only to find that they're all in use, and then creates a new line
pointer. That adds up, especially with narrow tuples like the ones I used
in the test.
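
To illustrate why that scan adds up, here is a minimal toy model, not the
actual PageAddItem code. ToyPage, toy_page_init, toy_add_item_scan,
MAX_LINE_POINTERS and the checks counter are all hypothetical names made up
for this sketch. It only assumes the behaviour described above: every insert
looks at each existing line pointer before allocating a new one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_LINE_POINTERS 256   /* hypothetical per-page capacity */

/* Toy stand-in for the line pointer array of a heap page. */
typedef struct
{
    bool          in_use[MAX_LINE_POINTERS];
    size_t        nline;    /* line pointers allocated so far */
    unsigned long checks;   /* line pointers examined, for illustration */
} ToyPage;

void
toy_page_init(ToyPage *page)
{
    assert(page != NULL);
    memset(page, 0, sizeof(*page));
}

/*
 * Add one item: scan for a free line pointer first, and allocate a new
 * one only if the scan finds none. Returns the slot used, or (size_t) -1
 * if the page has no room left.
 */
size_t
toy_add_item_scan(ToyPage *page)
{
    size_t slot;

    assert(page != NULL);

    for (slot = 0; slot < page->nline; slot++)
    {
        page->checks++;
        if (!page->in_use[slot])
            break;              /* found a reusable line pointer */
    }

    if (slot == page->nline)
    {
        /* Every existing line pointer is in use: allocate a new one. */
        if (page->nline == MAX_LINE_POINTERS)
            return (size_t) -1;
        page->nline++;
    }

    page->in_use[slot] = true;
    return slot;
}
```

In this model, filling an empty page with n items examines
0 + 1 + ... + (n - 1) line pointers in total, so the work per page grows
quadratically with the number of tuples that fit on it. That is why narrow
tuples make it worse.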

Attached is a fix for that. It adds a flag to each heap page that
indicates "there aren't any free line pointers on this page, so
don't bother trying". Heap pages haven't had any heap-specific per-page
data before, so this patch adds a HeapPageOpaqueData struct that's
stored in the special space.
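
The idea of the flag can be sketched with a self-contained toy model. This
is not the attached patch: FlaggedPage, no_free_lp, flagged_add_item and
flagged_free_item are hypothetical names, and when exactly the flag is set
and cleared is an assumption made for illustration. Here it is set once a
scan finds nothing free, and cleared whenever a line pointer is freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_LINE_POINTERS 256   /* hypothetical per-page capacity */

/* Toy page with a hint flag, standing in for per-page opaque data. */
typedef struct
{
    bool          in_use[MAX_LINE_POINTERS];
    size_t        nline;       /* line pointers allocated so far */
    bool          no_free_lp;  /* hint: no free line pointers, skip the scan */
    unsigned long checks;      /* line pointers examined, for illustration */
} FlaggedPage;

void
flagged_page_init(FlaggedPage *page)
{
    assert(page != NULL);
    memset(page, 0, sizeof(*page));
}

/* Add one item, skipping the scan when the hint says it is pointless. */
size_t
flagged_add_item(FlaggedPage *page)
{
    size_t slot;

    assert(page != NULL);
    slot = page->nline;         /* default: allocate a new line pointer */

    if (!page->no_free_lp)
    {
        for (slot = 0; slot < page->nline; slot++)
        {
            page->checks++;
            if (!page->in_use[slot])
                break;          /* found a reusable line pointer */
        }
        if (slot == page->nline)
            page->no_free_lp = true;    /* nothing free: remember that */
    }

    if (slot == page->nline)
    {
        if (page->nline == MAX_LINE_POINTERS)
            return (size_t) -1;         /* page is full */
        page->nline++;
    }

    page->in_use[slot] = true;
    return slot;
}

/* Free a line pointer; the hint no longer holds, so clear it. */
void
flagged_free_item(FlaggedPage *page, size_t slot)
{
    assert(page != NULL && slot < page->nline);
    page->in_use[slot] = false;
    page->no_free_lp = false;
}
```

With the hint in place, a bulk load into a fresh page examines no line
pointers at all in this model, while a freed line pointer is still found
and reused by the next insert.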

My simple test case of a COPY FROM of 10,000,000 tuples took 19.6 s
without the patch, and 17.7 s with the patch applied, a reduction of
roughly 10%. Your mileage may vary.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachment: nofreelinepointers-1.patch (text/x-patch, 8.4 KB)
