Re: COPY FROM performance improvements

From: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
To: <llonergan(at)greenplum(dot)com>
Cc: <agoldshuv(at)greenplum(dot)com>, <pgsql-patches(at)postgresql(dot)org>
Subject: Re: COPY FROM performance improvements
Date: 2005-06-25 10:45:13
Message-ID: 1688.24.211.165.134.1119696313.squirrel@www.dunslane.net
Lists: pgsql-patches
Luke Lonergan said:
> I've attached Alon's patch ported to the CVS trunk.  It applies cleanly
> and passes the regressions.  With fsync=false it is 40% faster loading
> a sample dataset with 15 columns of varied type.  It's 19% faster with
> fsync=true.
>
> This patch separates the CopyFrom code into two pieces, the new logic
> for delimited data and the existing logic for CSV and Binary.
>


A few quick comments - I will probably have many more later when I have
time to review this in depth.

1. Postgres does context diffs for patches, not unidiffs.

2. This comment raises a flag in my mind:

+ * each attribute begins. If a specific attribute is not used for this
+ * COPY command (ommitted from the column list), a value of 0 will be assigned.
+ * For example: for table foo(a,b,c,d,e) and COPY foo(a,b,e)
+ * attr_offsets may look something like this after this routine
+ * returns: [0,20,0,0,55]. That means that column "a" value starts
+ * at byte offset 0, "b" in 20 and "e" in 55, in attr_bytebuf.

Would it not be better to mark missing attributes with something that can't
be a valid offset, like -1?
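
To illustrate the point: in the example from the comment, column "a" really does start at offset 0, so it cannot be told apart from the unused columns "c" and "d". Below is a minimal sketch of the -1 approach. The names ATTR_NOT_USED, fill_attr_offsets and attr_is_used are hypothetical and do not come from the patch; only attr_offsets is taken from the quoted comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sentinel: a valid byte offset can never be negative. */
#define ATTR_NOT_USED (-1)

/*
 * Fill attr_offsets[0..natts-1] for one input line.
 *
 * used[i] says whether table column i appears in the COPY column list.
 * field_starts[] holds the byte offsets of the fields actually present
 * in the data line, in order. Columns that are not in the column list
 * get ATTR_NOT_USED rather than 0.
 */
static void
fill_attr_offsets(int *attr_offsets, const bool *used, int natts,
                  const int *field_starts, int nfields)
{
    int next_field = 0;

    for (int i = 0; i < natts; i++)
    {
        if (used[i])
        {
            /* every listed column must have a field in the data */
            assert(next_field < nfields);
            attr_offsets[i] = field_starts[next_field++];
        }
        else
            attr_offsets[i] = ATTR_NOT_USED;
    }
}

/* True if the column was part of the COPY column list. */
static bool
attr_is_used(const int *attr_offsets, int attno)
{
    return attr_offsets[attno] != ATTR_NOT_USED;
}
```

For table foo(a,b,c,d,e) and COPY foo(a,b,e) with fields starting at bytes 0, 20 and 55, this would give [0,-1 markers aside] the array [0,20,-1,-1,55], so a reader of the array no longer needs the column list to know that "a" is present and "c" is not.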


3. This comment needs improving:

+/*
+ * Copy FROM file to relation with faster processing.
+ */

4. We should indeed do this for CSV, especially since a lot of the relevant
logic for detecting attribute starts is already there for CSV in
CopyReadLine. I'm prepared to help you do that if necessary, since I'm
guilty of perpetrating that code.

cheers

andrew


