Re: Parallel query execution

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel query execution
Date: 2013-01-16 03:13:33
Lists: pgsql-hackers

* Claudio Freire (klaussfreire(at)gmail(dot)com) wrote:
> On Tue, Jan 15, 2013 at 8:19 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > The 1GB idea is interesting. I found in pg_upgrade that file copy would
> > just overwhelm the I/O channel, and that doing multiple copies on the
> > same device had no win, but those were pure I/O operations --- a
> > sequential scan might be enough of a mix of I/O and CPU that parallelism
> > might help.
> AFAIR, synchroscans were introduced because multiple large sequential
> scans were counterproductive (big time).

Sequentially scanning the *same* data over and over is certainly
counterproductive. Synchroscans fixed that, yes. That's not what we're
talking about, though; we're talking about scanning and processing
independent sets of data using multiple processes. It's certainly
possible that in some cases that won't be as good, but there will be
quite a few cases where it's much, much better.
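
To make the idea concrete, here is a rough sketch in Python (not
PostgreSQL code, and not a proposed design). It assumes the "table" is
just an in-memory list of integers, and the names expensive(),
split_into_chunks(), scan_chunk() and parallel_scan() are made up for
illustration. Each worker process handles its own non-overlapping chunk
of rows, and the partial results are combined at the end.

```python
import math
from multiprocessing import Pool


def expensive(row: int) -> float:
    """Hypothetical stand-in for a complicated, CPU-bound per-row function."""
    total = 0.0
    for k in range(1, 2000):
        total += math.sin(row * k) ** 2
    return total


def split_into_chunks(rows, n_chunks):
    """Split rows into at most n_chunks contiguous, non-overlapping slices."""
    size = max(1, math.ceil(len(rows) / n_chunks))
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def scan_chunk(chunk):
    """One worker's 'scan': apply the per-row function to its own chunk only."""
    return sum(expensive(row) for row in chunk)


def parallel_scan(rows, n_workers):
    """Process independent chunks in separate processes, then combine."""
    chunks = split_into_chunks(rows, n_workers)
    with Pool(processes=n_workers) as pool:
        return sum(pool.map(scan_chunk, chunks))


if __name__ == "__main__":
    rows = list(range(2000))
    # Single process: limited to one CPU no matter how fast the I/O is.
    print("serial:  ", scan_chunk(rows))
    # Four processes: each one works on a different quarter of the rows.
    print("parallel:", parallel_scan(rows, n_workers=4))
```

The two totals should agree up to floating-point rounding, since the
chunks are summed in a different order. How much faster the parallel
version runs depends on how many CPUs are free and on how expensive the
per-row work is compared with the cost of starting the workers.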

Consider a very complicated function running against each row which
makes the CPU the bottleneck instead of the I/O system. That type of
query will never run faster than a single CPU allows in a single-process
environment, regardless of whether you have synchroscans or not, while
in a multi-process environment you'll take advantage of the extra CPUs
which are available and use more of the I/O bandwidth that isn't yet


