From: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
---|---|
To: | Dimitri Fontaine <dfontaine(at)hi-media(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org, "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)sun(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com> |
Subject: | Re: Benchmark Data requested --- pgloader CE design ideas |
Date: | 2008-02-06 11:45:24 |
Message-ID: | 1202298324.4252.965.camel@ebony.site |
Lists: | pgsql-performance |
On Wed, 2008-02-06 at 12:27 +0100, Dimitri Fontaine wrote:
> Multi-Threading behavior and CE support
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>
> Now, pgloader will be able to run N threads, each one loading some data to a
> partitioned child-table target. N will certainly be configured depending on
> the number of server cores and not depending on the number of partitions...
>
> So what do we do when reading a tuple we want to store in a partition which
> has no dedicated Thread started yet, and we already have N Threads running?
> I'm thinking about some LRU(Thread) to choose a Thread to terminate (launch
> COPY with current buffer and quit) and start a new one for the current
> partition target.
> Hopefully there won't be such high values of N that the LRU is a bad choice
> per se, and the input data won't be so messy as to have to stop/start Threads
> at each new line.
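
The LRU scheme in the quoted text could be sketched roughly as follows. This
is only an illustration under simplifying assumptions: each "Thread" is
modelled as an in-memory row buffer rather than a real thread, and the class
name `LruPartitionBuffers` and the `flush` callback (standing in for "launch
COPY with current buffer") are hypothetical, not part of pgloader.

```python
from collections import OrderedDict


class LruPartitionBuffers:
    """Keep at most max_active partition buffers; evict the least recently used."""

    def __init__(self, max_active, flush):
        self.max_active = max_active
        self.flush = flush            # callable(partition, rows); stands in for COPY
        self.active = OrderedDict()   # partition -> buffered rows, oldest first

    def add(self, partition, row):
        if partition in self.active:
            # Partition already has a buffer: mark it as most recently used.
            self.active.move_to_end(partition)
        else:
            if len(self.active) >= self.max_active:
                # All N slots are taken: flush and drop the least recently used.
                victim, rows = self.active.popitem(last=False)
                self.flush(victim, rows)
            self.active[partition] = []
        self.active[partition].append(row)

    def close(self):
        # Flush whatever is still buffered, oldest first.
        while self.active:
            partition, rows = self.active.popitem(last=False)
            self.flush(partition, rows)
```

If the input alternates between more than N partitions on every line, each
call to `add` triggers a flush, which is the "messy input" case described in
the quoted text.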

For me, it would be good to see a --parallel=n parameter that would
allow pgloader to distribute rows in "round-robin" manner to "n"
different concurrent COPY statements, i.e. a non-routing version. Making
that work well, whilst continuing to do error handling, seems like a
challenge, but a very useful goal.
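
A minimal sketch of the non-routing, round-robin distribution might look like
the following. It makes several assumptions: `parallel_load` and its
`copy_batch` callback are hypothetical names, `copy_batch(worker_id, batch)`
stands in for running one COPY on that worker's own session, and the error
handling shown (collecting failed batches for later retry) is just one
possible policy.

```python
import queue
import threading

_STOP = object()  # sentinel telling a worker that no more rows will arrive


def parallel_load(rows, n, copy_batch, batch_size=1000):
    """Send rows round-robin to n workers; return the batches that failed."""
    queues = [queue.Queue(maxsize=2 * batch_size) for _ in range(n)]
    rejected = []              # (worker_id, batch, exception) per failed batch
    lock = threading.Lock()

    def flush(worker_id, batch):
        try:
            copy_batch(worker_id, batch)
        except Exception as exc:
            # Keep loading; remember the failed batch so it can be retried.
            with lock:
                rejected.append((worker_id, batch, exc))

    def worker(worker_id):
        batch = []
        while True:
            row = queues[worker_id].get()
            if row is _STOP:
                break
            batch.append(row)
            if len(batch) >= batch_size:
                flush(worker_id, batch)
                batch = []
        if batch:
            flush(worker_id, batch)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    # Round-robin: row i goes to worker i mod n, with no routing decision.
    for i, row in enumerate(rows):
        queues[i % n].put(row)
    for q in queues:
        q.put(_STOP)
    for t in threads:
        t.join()
    return rejected
```

The bounded queues make the reader block when the workers fall behind, so
memory use stays limited. A failed batch could afterwards be split and retried
row by row to isolate the offending rows.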

Adding intelligence to the row distribution may be technically hard, but
it may also simply move the bottleneck onto pgloader. We may need multiple
threads in pgloader, or we may just need multiple sessions from
pgloader. Experience from doing the non-routing parallel version may
help in deciding whether to go for the routing version.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com