Re: An idea for parallelizing COPY within one backend

From:	Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Cc:	"Florian G. Pflug" <fgp(at)phlo(dot)org>
Subject:	Re: An idea for parallelizing COPY within one backend
Date:	2008-02-27 08:09:12
Message-ID:	200802270909.15336.dfontaine@hi-media.com
Lists:	pgsql-hackers

Hi,

On Wednesday 27 February 2008, Florian G. Pflug wrote:
> Upon reception of a COPY INTO command, a backend would
> .) Fork off a "dealer" and N "worker" processes that take over the
> client connection. The "dealer" distributes lines received from the
> client to the N workers, while the original backend receives them
> as tuples back from the workers.

This looks so much like what pgloader does now (version 2.3.0~dev2, release
candidate) at the client side, when configured for it, that I can't help
answering the mail :)
http://pgloader.projects.postgresql.org/dev/pgloader.1.html#_parallel_loading
section_threads = N
split_file_reading = False
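
To make the dealer/worker split concrete, here is a minimal sketch in Python.
It is not pgloader's actual code and not the proposed backend design: the
names parse_line, worker and parallel_parse are hypothetical, the input is
assumed to be tab-separated text, and threads stand in for the processes of
the proposal (so it shows the structure, not a speedup).

```python
import queue
import threading

SENTINEL = None  # tells a worker that no more lines will arrive


def parse_line(line, sep="\t"):
    """Hypothetical parser: split one text line into a tuple of fields."""
    return tuple(line.rstrip("\n").split(sep))


def worker(inbox, outbox):
    """Parse lines from inbox until the sentinel arrives."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            break
        lineno, line = item
        outbox.put((lineno, parse_line(line)))


def parallel_parse(lines, n_workers=4):
    """Deal lines round-robin to n_workers and collect the parsed tuples."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    outbox = queue.Queue()
    threads = [threading.Thread(target=worker, args=(q, outbox))
               for q in inboxes]
    for t in threads:
        t.start()

    # The "dealer" role: hand each line to the next worker in turn.
    count = 0
    for lineno, line in enumerate(lines):
        inboxes[lineno % n_workers].put((lineno, line))
        count += 1
    for q in inboxes:
        q.put(SENTINEL)
    for t in threads:
        t.join()

    # The "original backend" role: receive the tuples back and restore
    # the input order using the line number attached by the dealer.
    results = [outbox.get() for _ in range(count)]
    return [tup for _, tup in sorted(results)]
```

Tagging each line with its number is one assumed way to keep the output in
input order; a real loader might not need that guarantee at all.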

Of course, the backends still have to parse the input given by pgloader, which
only pre-processes the data. I'm not sure that having the client prepare the
data any further (binary format or whatever) is a wise idea, as you mentioned
and given Tom's follow-up. But maybe I'm all wrong, so I'm all ears!

Regards,
--
dim

In response to

An idea for parallelizing COPY within one backend at 2008-02-27 02:43:11 from Florian G. Pflug

Responses

Re: An idea for parallelizing COPY within one backend at 2008-02-27 10:47:29 from Simon Riggs
Re: An idea for parallelizing COPY within one backend at 2008-02-27 14:11:10 from Florian G. Pflug
