Re: [PERFORM] A Better External Sort?

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: "Jeffrey W(dot) Baker" <jwbaker(at)acm(dot)org>
Cc: Ron Peacetree <rjpeace(at)earthlink(dot)net>, pgsql-hackers(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] A Better External Sort?
Date: 2005-09-29 16:54:05
Message-ID: 433C1C2D.7090908@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

Jeff, Ron,

First off, Jeff, please take it easy. We're discussing 8.2 features at
this point and there's no reason to get stressed out at Ron. You can
get plenty stressed out when 8.2 is near feature freeze. ;-)

Regarding use cases for better sorts:

The biggest single area where I see PostgreSQL external sort sucking is
on index creation on large tables. For example, for free version of
TPCH, it takes only 1.5 hours to load a 60GB Lineitem table on OSDL's
hardware, but over 3 hours to create each index on that table. This
means that over all our load into TPCH takes 4 times as long to create
the indexes as it did to bulk load the data.

Anyone restoring a large database from pg_dump is in the same situation.
Even worse, if you have to create a new index on a large table on a
production database in use, because the I/O from the index creation
swamps everything.

Following an index creation, we see that 95% of the time required is the
external sort, which averages 2mb/s. This is with seperate drives for
the WAL, the pg_tmp, the table and the index. I've confirmed that
increasing work_mem beyond a small minimum (around 128mb) had no benefit
on the overall index creation speed.

--Josh Berkus

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Luke Lonergan 2005-09-29 17:06:52 Re: [PERFORM] A Better External Sort?
Previous Message Josh Berkus 2005-09-29 16:35:30 Re: effective SELECT from child tables

Browse pgsql-performance by date

  From Date Subject
Next Message Luke Lonergan 2005-09-29 17:06:52 Re: [PERFORM] A Better External Sort?
Previous Message PFC 2005-09-29 16:17:05 Re: Comparative performance