Re: [HACKERS] A Better External Sort?

From: Ron Peacetree <rjpeace(at)earthlink(dot)net>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>, pgsql-hackers(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [HACKERS] A Better External Sort?
Date: 2005-10-01 17:42:32
Message-ID: 18329659.1128188552696.JavaMail.root@elwamui-polski.atl.sa.earthlink.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

You have not said anything about what HW, OS version, and pg version
used here, but even at that can't you see that something Smells Wrong?

The most common CPUs currently shipping have clock rates of ~2-3GHz
and have 8B-16B internal pathways. SPARCs and other like CPUs are
clocked slower but have 16B-32B internal pathways. In short, these
CPU's have an internal bandwidth of 16+ GBps.

The most common currently shipping mainboards have 6.4GBps RAM
subsystems. ITRW, their peak is ~80% of that, or ~5.1GBps.

In contrast, the absolute peak bandwidth of a 133MHx 8B PCI-X bus is
1GBps, and ITRW it peaks at ~800-850MBps. Should anyone ever build
a RAID system that can saturate a PCI-Ex16 bus, that system will be
maxing ITRW at ~3.2GBps.

CPUs should NEVER be 100% utilized during copy IO. They should be
idling impatiently waiting for the next piece of data to finish being
processed even when the RAM IO subsystem is pegged; and they
definitely should be IO starved rather than CPU bound when doing
HD IO.

Those IO rates are also alarming in all but possibly the first case. A
single ~50MBps HD doing 21MBps isn't bad, but for even a single
~80MBps HD it starts to be of concern. If any these IO rates came
from any reasonable 300+MBps RAID array, then they are BAD.

What your simple experiment really does is prove We Have A
Problem (tm) with our IO code at either or both of the OS or the pg
level(s).

Ron


-----Original Message-----
From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Sent: Oct 1, 2005 12:19 PM
Subject: Re: [HACKERS] [PERFORM] A Better External Sort?

On Sat, Oct 01, 2005 at 10:22:40AM -0400, Ron Peacetree wrote:
> Assuming we get the abyssmal physical IO performance fixed...
> (because until we do, _nothing_ is going to help us as much)

I'm still not convinced this is the major problem. For example, in my
totally unscientific tests on an oldish machine I have here:

Direct filesystem copy to /dev/null
21MB/s 10% user 50% system (dual cpu, so the system is using a whole CPU)

COPY TO /dev/null WITH binary
13MB/s 55% user 45% system (ergo, CPU bound)

COPY TO /dev/null
4.4MB/s 60% user 40% system

\copy to /dev/null in psql
6.5MB/s 60% user 40% system

This machine is a bit strange setup, not sure why fs copy is so slow.
As to why \copy is faster than COPY, I have no idea, but it is
repeatable. And actually turning the tuples into a printable format is
the most expensive. But it does point out that the whole process is
probably CPU bound more than anything else.

So, I don't think physical I/O is the problem. It's something further
up the call tree. I wouldn't be surprised at all it it had to do with
the creation and destruction of tuples. The cost of comparing tuples
should not be underestimated.

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2005-10-01 18:00:12 Re: \d on database with a lot of tables is slow
Previous Message Tom Lane 2005-10-01 17:16:38 Re: Expression index ignores column statistics target

Browse pgsql-performance by date

  From Date Subject
Next Message Roger Hand 2005-10-01 19:51:08 Re: [HACKERS] Query in SQL statement
Previous Message Ron Peacetree 2005-10-01 16:38:55 Re: [HACKERS] A Better External Sort?