Re: [HACKERS] A Better External Sort?

From: "Jeffrey W(dot) Baker" <jwbaker(at)acm(dot)org>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Michael Stone <mstone+postgres(at)mathom(dot)us>, pgsql-performance(at)postgresql(dot)org
Subject: Re: [HACKERS] A Better External Sort?
Date: 2005-10-03 21:32:26
Message-ID: 1128375146.29080.9.camel@toonses.gghcwest.com
Lists: pgsql-hackers pgsql-performance

On Mon, 2005-10-03 at 14:16 -0700, Josh Berkus wrote:
> Jeff,
>
> > > Nope, LOTS of testing, at OSDL, GreenPlum and Sun. For comparison, A
> > > Big-Name Proprietary Database doesn't get much more than that either.
> >
> > I find this claim very suspicious. I get single-threaded reads in
> > excess of 1GB/sec with XFS and > 250MB/sec with ext3.
>
> Database reads? Or raw FS reads? It's not the same thing.

Just reading files off the filesystem. These are input rates I get with
a specialized sort implementation. 1GB/sec is not even especially
wonderful; I can get that on two controllers with a 24-disk stripe set.
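
For anyone who wants to compare numbers, a minimal sketch of how a
single-threaded sequential read rate like this could be measured. It is
not the sort implementation mentioned above; the function name, the 1MB
block size, and the MB = 10^6 bytes convention are assumptions made for
illustration.

```python
import time


def sequential_read_rate(path, block_size=1 << 20):
    """Read `path` front to back in one thread.

    Returns (rate_in_MB_per_sec, total_bytes_read), with MB = 10^6 bytes.
    The 1MB default block size is an assumption, not a tuned value.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 gives an unbuffered raw file, so each read() maps to
    # one read request of at most block_size bytes.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
    # Guard against a zero interval on very small files.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total / elapsed / 1e6, total
```

A file that is already in the OS page cache will report memory speed
rather than disk speed, so the test file should be much larger than RAM
or the cache should be dropped first.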

I guess database reads are different, but I remain unconvinced that they
are *fundamentally* different. After all, a tab-delimited file (my sort
workload) is a kind of database.

> Also, we're talking *write speed* here, not read speed.

OK, I did not realize that. Still, you should see 250-300MB/sec
single-threaded sequential output on ext3, assuming the storage can
provide that rate.
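
A rough way to check single-threaded sequential output is sketched
below. The function name, block size, and the decision to include the
final fsync in the timed interval are all assumptions; leaving the
fsync out would mostly measure how fast the page cache absorbs writes.

```python
import os
import time


def sequential_write_rate(path, total_bytes, block_size=1 << 20):
    """Write `total_bytes` of zeros to `path` sequentially in one thread.

    Returns (rate_in_MB_per_sec, bytes_written), with MB = 10^6 bytes.
    The timing includes fsync so the data has reached the storage stack.
    """
    block = b"\0" * block_size
    written = 0
    start = time.perf_counter()
    with open(path, "wb", buffering=0) as f:
        while written < total_bytes:
            remaining = total_bytes - written
            # The last block may be shorter than block_size.
            n = f.write(block[:min(block_size, remaining)])
            written += n
        os.fsync(f.fileno())  # flush OS buffers before stopping the clock
    # Guard against a zero interval on very small files.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return written / elapsed / 1e6, written
```

As with reads, the output file should be large relative to RAM for the
figure to say anything about the filesystem and disks.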

> I also find *your* claim suspicious, since there's no way XFS is 300% faster
> than ext3 for the *general* case.

On a single disk you wouldn't notice, but XFS scales much better when
you throw disks at it. I get a 50MB/sec boost from the 24th disk,
whereas ext3 stops scaling after 16 disks. For writes both XFS and ext3
top out around 8 disks, but in this case XFS tops out at 500MB/sec while
ext3 can't break 350MB/sec.

I'm hopeful that in the future the work being done at ClusterFS will
put ext3 on par with XFS.

-jwb
