Re: An idea for parallelizing COPY within one backend

From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Brian Hurt <bhurt(at)janestcapital(dot)com>, Postgresql-Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: An idea for parallelizing COPY within one backend
Date: 2008-02-27 17:47:26
Message-ID: 47C5A22E.7000301@phlo.org
Lists: pgsql-hackers

Tom Lane wrote:
> "Florian G. Pflug" <fgp(at)phlo(dot)org> writes:
>> Plus, I'd see this as a kind of testbed for gently introducing
>> parallelism into postgres backends (especially thinking about sorting
>> here).
>
> This thinking is exactly what makes me scream loudly and run in the
> other direction. I don't want threads introduced into the backend,
> whether "gently" or otherwise. The portability and reliability hits
> that we'll take are too daunting. Threads that invoke user-defined
> code (as anything involved with datatype-specific operations must)
> are especially fearsome, as there is precisely 0 chance of that code
> being thread-safe.

Exactly my thinking. That is why I was looking for a way to introduce
parallelism *without* threading. Though it's not so much the
user-defined code that scares me, but rather the portability issues. The
differences between NPTL and non-NPTL threads on Linux alone make me
shudder...

What I was saying is that there might be a chance to get some parallelism
without threading, by executing well-defined pieces of code with
controlled dependencies in separate processes. COPY seemed like an ideal
testbed for that idea, since the conversion of received lines into
tuples seemed reasonably self-contained, with few outside
dependencies. If the idea can't be made to work there, it probably won't
work anywhere. If it turns out that it does (with an API change for
input/output functions) however, then it *might* be possible to apply it
to other relatively self-contained parts in the future...
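
A minimal sketch of that shape, in Python rather than backend C and
POSIX-only (it relies on fork()): the parent hands each chunk of input
lines to a separate child process, each child converts its lines into
tuples, and the parent collects the results over pipes in input order.
The names parse_line and parallel_parse, the two-column tab-separated
format, and the JSON transport are all assumptions made for illustration;
none of this is existing PostgreSQL code, and real input functions would
need the API change mentioned above.

```python
import json
import os


def parse_line(line):
    """Convert one input line into a tuple of typed fields.

    Hypothetical format: an integer id and a text name, tab-separated.
    """
    ident, name = line.rstrip("\n").split("\t")
    return (int(ident), name)


def parallel_parse(lines, workers=2):
    """Parse `lines` in forked child processes; return tuples in input order."""
    # Contiguous chunks, so concatenating the results preserves line order.
    size = max(1, -(-len(lines) // workers))  # ceiling division
    chunks = [lines[i * size:(i + 1) * size] for i in range(workers)]

    children = []
    for chunk in chunks:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: its only job is to parse its chunk and report back.
            os.close(read_fd)
            status = 1
            try:
                payload = json.dumps([parse_line(l) for l in chunk]).encode()
                with os.fdopen(write_fd, "wb") as out:
                    out.write(payload)
                status = 0
            finally:
                # Leave without running any of the parent's cleanup handlers.
                os._exit(status)
        # Parent: keep only the read end, so EOF arrives when the child exits.
        os.close(write_fd)
        children.append((pid, read_fd))

    tuples = []
    for pid, read_fd in children:
        with os.fdopen(read_fd, "rb") as src:
            data = src.read()  # read to EOF before waiting, to avoid deadlock
        _, wait_status = os.waitpid(pid, 0)
        if wait_status != 0:
            raise RuntimeError(f"worker {pid} failed to parse its chunk")
        # JSON turns tuples into lists, so convert each row back.
        tuples.extend(tuple(row) for row in json.loads(data))
    return tuples
```

The children share nothing with the parent except the chunk they were
forked with and the pipe they answer on, which is the "controlled
dependencies" part; whether datatype input functions can be confined
that tightly is exactly the open question.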

To restate, I don't want threaded backends. Not in the foreseeable
future at least. But I'd still love to see a single transaction using
more than one core.

regards, Florian Pflug
