Re: Parallel query execution

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel query execution
Date: 2013-01-16 15:23:14
Message-ID: CAGTBQpbmHdr3Ev3K9wcsepNUR3Fu7gapvibr3AjbfEQiDGT_qQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 16, 2013 at 10:33 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Claudio Freire (klaussfreire(at)gmail(dot)com) wrote:
>> Well, there's the fault in your logic. It won't be as linear.
>
> I really don't see how this has become so difficult to communicate.
>
> It doesn't have to be linear.
>
> We're currently doing massive amounts of parallel processing by hand
> using partitioning, tablespaces, and client-side logic to split up the
> jobs. It's certainly *much* faster than doing it in a single thread.
> It's also faster with 10 processes going than 5 (we've checked). With
> 10 going, we've hit the FC fabric limit (and these are spinning disks in
> the SAN, not SSDs). I'm also sure it'd be much slower if all 10
> processes were trying to read data through a single process that's
> reading from the I/O system. We've got some processes which essentially
> end up doing that and we don't come anywhere near the total FC fabric
> bandwidth when just scanning through the system because, at that point,
> you do hit the limits of how fast the individual drive sets can provide
> data.

Well... just to close this out (and let people focus on 9.3's CF): that's a
level of hardware I haven't had experience with, but it seems to behave
very differently from regular (big and small) RAID arrays.

In any case, perhaps tablespaces are a hint here: if nodes are working
on different tablespaces, there's an indication that they *can* be
parallelized efficiently. That could be fleshed out on a "parallel
execution" node, but for that to work the whole execution engine needs
to be thread-safe (or it has to fork). It won't be easy.
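
To make the tablespace hint concrete, here is a minimal sketch of the
heuristic only, not of anything in the PostgreSQL planner. The names
ScanNode, group_by_tablespace and can_parallelize are hypothetical, and it
assumes each scan node knows which tablespace its relation lives in.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanNode:
    """Hypothetical stand-in for a plan node that scans one relation."""
    relation: str
    tablespace: str


def group_by_tablespace(nodes):
    """Bucket scan nodes by the tablespace they read from."""
    groups = defaultdict(list)
    for node in nodes:
        groups[node.tablespace].append(node)
    return dict(groups)


def can_parallelize(nodes):
    """Heuristic only: nodes spread over more than one tablespace are
    assumed to sit on independent I/O paths, so running them
    concurrently is *likely* to pay off. It says nothing about whether
    the executor is able to run them concurrently."""
    return len(group_by_tablespace(nodes)) > 1
```

Each group returned by group_by_tablespace would be a candidate unit of
work for one worker; nodes sharing a tablespace stay together so they
don't compete for the same drives.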

It's best to concentrate on lower-hanging fruit, like sorting and aggregates.

Now back to the CF.
