From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel query execution
* Gavin Flower (GavinFlower(at)archidevsys(dot)co(dot)nz) wrote:
> How about being aware of multiple spindles - so if the requested
> data covers multiple spindles, then data could be extracted in
> parallel. This may, or may not, involve multiple I/O channels?
Yes, this should dovetail with partitioning and tablespaces to pick up
on exactly that. We're implementing our own poor-man's parallelism
using exactly this to use as much of the CPU and I/O bandwidth as we
can. I have every confidence that it could be done better and be
simpler for us if it was handled in the backend.
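The "poor-man's parallelism" described above amounts to issuing one query per partition from separate client-side workers and merging the results. A minimal sketch of that pattern, with a hypothetical `scan_partition` standing in for a real per-connection query (a real version would open one database connection per worker, e.g. via a driver such as psycopg2, so the backend scans actually run concurrently):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for running "SELECT ... FROM <partition>" on its
# own connection; here it just filters an in-memory list of rows.
def scan_partition(partition):
    return [row for row in partition if row % 2 == 0]

def parallel_scan(partitions, max_workers=4):
    # One client-side worker per partition; the merged result is what the
    # backend would otherwise have produced from a single serial scan.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(scan_partition, partitions)
    return [row for chunk in results for row in chunk]

partitions = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(parallel_scan(partitions))  # -> [2, 4, 6, 8]
```

The merge step is trivial here; in practice it is where client-side parallelism gets painful (ordering, aggregation), which is the argument for doing this in the backend.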
> On large multiple processor machines, there are different blocks of
> memory that might be accessed at different speeds depending on the
> processor. Possibly a mechanism could be used to split a transaction
> over multiple processors to ensure the fastest memory is used?
Let's work on getting it working on the h/w that PG is most commonly
deployed on first.. I agree that we don't want to paint ourselves into
a corner with this, but I don't think massive NUMA systems are what we
should focus on first (are you familiar with any that run PG today..?).
I don't expect we're going to be trying to fight with the Linux (or
whatever) kernel over what threads run on what processors with access to
what memory on small-NUMA systems (x86-based).
> Once a selection of rows has been made, then if there is a lot of
> reformatting going on, then could this be done in parallel? I can
> think of 2 very simplistic strategies: (A) use a different
> processor core for each column, or (B) farm out sets of rows to
> different cores. I am sure in reality, there are more subtleties
> and aspects of both the strategies will be used in a hybrid fashion
> along with other approaches.
Given our row-based storage architecture, I can't imagine we'd do
anything other than take a row-based approach to this.. I would think
we'd do two things: parallelize based on partitioning, and parallelize
seqscans across the individual heap files, which are split on a per-1G
boundary already. Perhaps we can generalize that and scale it based on
the number of available processors and the size of the relation, but I
could see advantages in matching up with what the kernel sees as
separate files.
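Scaling scan workers to the 1G heap segments mentioned above could be as simple as capping the worker count at the smaller of the segment count and the available CPUs. An illustrative sketch (the sizing policy is an assumption, not anything the backend does today):

```python
# PostgreSQL stores a large heap in 1 GB segment files (relfilenode,
# relfilenode.1, ...); RELSEG_SIZE defaults to 1 GiB.
SEGMENT_SIZE = 1 << 30

def scan_workers(relation_bytes, num_cpus):
    # Ceiling division: every partial segment still needs a scanner.
    segments = max(1, -(-relation_bytes // SEGMENT_SIZE))
    # One worker per segment, but never more workers than CPUs.
    return min(segments, num_cpus)

print(scan_workers(5 * (1 << 30), 8))    # 5 segments, 8 CPUs -> 5
print(scan_workers(100 * (1 << 30), 8))  # segment-rich relation -> 8
```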
> I expect that before any parallel algorithm is invoked, then some
> sort of threshold needs to be exceeded to make it worth while.
Certainly. That'd need to be included in the optimization model so we
only take the parallel path when it's actually expected to be cheaper.
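The threshold idea can be sketched as a toy cost comparison: parallelism only pays off once the serial cost is large enough to amortize the per-worker startup overhead. The constants here are invented for illustration, not planner values:

```python
# Assumed, made-up overhead of launching one worker (connection setup,
# coordination); real planner costs would be measured, not guessed.
STARTUP_COST_PER_WORKER = 1000.0

def worth_parallelizing(serial_cost, workers):
    # Ideal split of the work plus a fixed startup cost per worker.
    parallel_cost = serial_cost / workers + workers * STARTUP_COST_PER_WORKER
    return parallel_cost < serial_cost

print(worth_parallelizing(500.0, 4))     # tiny scan -> False
print(worth_parallelizing(100000.0, 4))  # big scan  -> True
```

With these numbers a 4-worker plan needs roughly 5,300 units of serial cost before it breaks even, which is the "threshold" being asked for.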