Re: Scaling up PostgreSQL in Multiple CPU / Dual Core

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Chris Browne <cbbrowne(at)acm(dot)org>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Scaling up PostgreSQL in Multiple CPU / Dual Core
Date: 2006-03-24 19:24:04
Message-ID: 20060324192404.GA9246@pervasive.com
Lists: pgsql-performance

On Fri, Mar 24, 2006 at 01:21:23PM -0500, Chris Browne wrote:
> > Correct me if I'm wrong, but there's no way to (reasonably) accomplish
> > that without having some dedicated extra processes lying around that
> > you can use to execute the queries, no? In other words, the cost of a
> > fork() during query execution would be prohibitive...
>
> Counterexample...
>
> The sort of scenario we keep musing about is where you split off a
> (thread|process) for each partition of a big table. There is in fact
> a natural such partitioning, in that tables get split at the 1GB mark,
> by default.
>
> Consider doing a join against 2 tables that are each 8GB in size
> (i.e., they consist of 8 data files each). Let's assume that the query
> plan indicates doing seq scans on both.
>
> You *know* you'll be reading through 16 files, each 1GB in size.
> Spawning a process for each of those files doesn't strike me as
> "prohibitively expensive."

Have you ever tried reading from 2 large files on a disk at the same
time, let alone 16? The results ain't pretty.

What you're suggesting might make sense if the two tables are in
different tablespaces, provided you have some additional way of knowing
whether those two tablespaces sit on the same set of spindles. Even
then the usefulness is somewhat suspect, because CPU is a hell of a lot
faster than disks are, unless you have a whole lot of disks. Of course,
that is exactly the target market for MPP.
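
The bottleneck argument above can be put as a back-of-envelope model.
This is only a sketch: the function name, its parameters, and all
throughput figures are hypothetical, and it ignores the seek thrashing
that concurrent readers cause on a shared spindle, which would make the
real numbers worse.

```python
def scan_rate_mb_s(workers: int, cpu_mb_s_per_worker: float,
                   spindles: int, disk_mb_s_per_spindle: float) -> float:
    """Aggregate seq-scan rate: the slower of total CPU and total disk."""
    cpu_total = workers * cpu_mb_s_per_worker       # what the workers could consume
    disk_total = spindles * disk_mb_s_per_spindle   # what the disks can deliver
    return min(cpu_total, disk_total)


# Hypothetical figures: one worker chews 200 MB/s, one disk delivers 50 MB/s.
one_worker = scan_rate_mb_s(1, 200, 1, 50)     # disk-bound already
many_workers = scan_rate_mb_s(16, 200, 1, 50)  # more workers, same single disk
many_disks = scan_rate_mb_s(16, 200, 16, 50)   # only more spindles raise the rate
```

With those assumed figures a single worker is already disk-bound, so
adding workers changes nothing until spindles are added as well.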

Where parallel execution really makes sense is when you're doing things
like sorts or hash operations, because those are relatively
CPU-intensive.
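
A minimal sketch of what a parallel sort could look like, assuming a
pool of worker processes that already exists (so no fork() per query).
It is not how PostgreSQL sorts; parallel_sort and its parameters are
hypothetical names used only for illustration. Each worker sorts one
chunk, and the caller merges the sorted runs.

```python
import heapq
from concurrent.futures import Executor
from typing import List, Sequence


def parallel_sort(rows: Sequence[int], executor: Executor,
                  n_chunks: int) -> List[int]:
    """Sort rows by sorting n_chunks slices on the executor, then merging."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    # Ceiling division so every row lands in exactly one chunk.
    size = max(1, -(-len(rows) // n_chunks))
    chunks = [list(rows[i:i + size]) for i in range(0, len(rows), size)]
    # The built-in sorted() is picklable by name, so it can be sent to
    # worker processes as well as to threads.
    sorted_runs = list(executor.map(sorted, chunks))
    # k-way merge of the sorted runs, done in the calling process.
    return list(heapq.merge(*sorted_runs))
```

Passing a concurrent.futures.ProcessPoolExecutor created ahead of time
is what would spread the CPU work across cores; the pool plays the role
of the "dedicated extra processes" mentioned in the quoted text. The
final merge is still serial, so the gain depends on how much of the
cost sits in the per-chunk sorts.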
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
