Re: [PERFORM] A Better External Sort?

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Ron Peacetree <rjpeace(at)earthlink(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] A Better External Sort?
Date: 2005-09-30 20:41:22
Message-ID: 200509301341.22795.josh@agliodbs.com
Lists: pgsql-hackers, pgsql-performance

Ron,

> That 11MBps was your =bulk load= speed.  If just loading a table
> is this slow, then there are issues with basic physical IO, not just
> IO during sort operations.

Oh, yeah.  Well, that's separate from sort.  See multiple posts on this 
list from the GreenPlum team, the COPY patch for 8.1, etc.  We've been 
concerned about I/O for a while.  

Realistically, you can't do better than about 25MB/s with single-threaded
I/O on current Linux machines, because your bottleneck isn't the actual
disk I/O.  It's CPU.  Databases which "go faster" than this are all, to
my knowledge, using multi-threaded disk I/O.

(and I'd be thrilled to get a consistent 25MB/s on PostgreSQL, but that's
another thread ...)
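
For anyone who wants to put a number on single-threaded read speed, a rough sketch in Python follows. It is not how PostgreSQL does its I/O; the function name and the 8192-byte block size (chosen to match PostgreSQL's default page size) are assumptions for illustration, and a file that was just written will mostly be served from the OS page cache, so the figure is only meaningful on a file larger than RAM or after the cache has been dropped.

```python
import os
import tempfile
import time


def sequential_read_mb_per_s(path, block_size=8192):
    """Read a file front to back from one thread.

    Returns (bytes_read, megabytes_per_second). Hypothetical helper for
    illustration only; block_size=8192 is an assumption.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 gives an unbuffered file object, so each read() call
    # maps to one read request of up to block_size bytes.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid divide-by-zero
    return total, total / (1024 * 1024) / elapsed


if __name__ == "__main__":
    # Small demo file; real measurements need a file much larger than RAM.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(4 * 1024 * 1024))
        demo_path = tmp.name
    try:
        nbytes, rate = sequential_read_mb_per_s(demo_path)
        print(f"read {nbytes} bytes at {rate:.1f} MB/s")
    finally:
        os.remove(demo_path)
```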

> As I said, the obvious candidates are inefficient physical layout
> and/or flawed IO code.

Yeah, that's what I thought too.  But try sorting a 10GB table, and
you'll see: disk I/O is practically idle, while CPU averages 90%+.  We're
CPU-bound, because sort is being really inefficient about something.  I
just don't know what yet.
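
One simple way to see whether a piece of work is CPU-bound or waiting on something else is to compare the CPU time the process consumed with the wall-clock time that passed. The Python sketch below does that for an arbitrary callable; `cpu_fraction` is a hypothetical helper, not a PostgreSQL facility, and an in-memory `sorted()` call only stands in for a real external sort.

```python
import random
import time


def cpu_fraction(work):
    """Run work() and return (wall_seconds, cpu_seconds, cpu / wall).

    A ratio near 1.0 suggests the work is CPU-bound; a ratio near 0.0
    suggests it spent most of its time waiting (on I/O, sleep, locks).
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # user + system CPU time of this process
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    ratio = cpu / wall if wall > 0 else 0.0
    return wall, cpu, ratio


if __name__ == "__main__":
    data = [random.random() for _ in range(1_000_000)]

    # In-memory sort: expected to be dominated by CPU time.
    wall, cpu, ratio = cpu_fraction(lambda: sorted(data))
    print(f"sort:  wall={wall:.2f}s cpu={cpu:.2f}s ratio={ratio:.2f}")

    # Sleeping: almost no CPU time, so the ratio stays close to zero.
    wall, cpu, ratio = cpu_fraction(lambda: time.sleep(0.5))
    print(f"sleep: wall={wall:.2f}s cpu={cpu:.2f}s ratio={ratio:.2f}")
```

The same comparison against a running backend can be made from outside the process with ordinary OS tools; the point is only that high CPU time with idle disks is what "CPU-bound" looks like in the numbers.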

If we move that CPU-binding to a higher level of performance, then we can
start looking at things like async I/O, O_DIRECT, pre-allocation, etc. that
will give us incremental improvements.  But what we need now is a 5-10x
improvement, and that's somewhere in the algorithms or the code.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco
