From: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
---|---|
To: | ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Batch update of indexes on data loading |
Date: | 2008-02-24 08:14:24 |
Message-ID: | 1203840864.4247.2.camel@ebony.site |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, 2008-02-21 at 13:26 +0900, ITAGAKI Takahiro wrote:
> This is a proposal of fast data loading using batch update of indexes for 8.4.
> It is a part of pg_bulkload (http://pgbulkload.projects.postgresql.org/) and
> I'd like to integrate it in order to cooperate with other parts of postgres.
>
> The basic concept is spooling new coming data, and merge the spool and
> the existing indexes into a new index at the end of data loading. It is
> 5-10 times faster than index insertion per-row, that is the way in 8.3.
>
>
> One of the problem is locking; Index building in bulkload is similar to
> REINDEX rather than INSERT, so we need ACCESS EXCLUSIVE LOCK during it.
> Bulkloading is not a upper compatible method, so I'm thinking about
> adding a new "WITH LOCK" option for COPY command.
>
> COPY tbl FROM 'datafile' WITH LOCK;
>
I'm very excited to see these concepts going into COPY.
One of the reasons why I hadn't wanted to pursue earlier ideas to use
LOCK was that applying a lock will prevent running in parallel, which
ultimately may prevent further performance gains.
Is there a way of doing this that will allow multiple concurrent COPYs?
--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
From | Date | Subject | |
---|---|---|---|
Next Message | Heikki Linnakangas | 2008-02-24 09:39:42 | Re: 8.3 / 8.2.6 restore comparison |
Previous Message | Luke Lonergan | 2008-02-24 01:46:40 | Re: CopyReadLineText optimization |