Re: [HACKERS] A Better External Sort?

From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Michael Stone" <mstone+postgres(at)mathom(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [HACKERS] A Better External Sort?
Date: 2005-10-05 23:55:51
Message-ID: BF69B617.10B72%llonergan@greenplum.com
Lists: pgsql-hackers pgsql-performance

Michael,

On 10/5/05 8:33 AM, "Michael Stone" <mstone+postgres(at)mathom(dot)us> wrote:

> real 0m8.889s
> user 0m0.877s
> sys 0m8.010s
>
> it's not in disk wait state (in fact the whole read was cached) but it's
> only getting 1MB/s.

You've proven my point completely. This process is bottlenecked in the CPU.
The only way to improve it would be to optimize the system (libc) functions
like "fread", where it is spending most of its time.

In COPY, we found lots of libc functions like strlen() being called
ridiculous numbers of times. In one case it was called on every
timestamp/date attribute to get the length of TZ, which is constant. That
one function call was in the system category, and was responsible for
several percent of the time.
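
To make the strlen() case concrete, here is a minimal sketch of the pattern and the obvious fix. It is not the actual COPY code; the function names and the `tz` parameter are hypothetical stand-ins for a time zone string that does not change while the rows are processed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration only; not PostgreSQL's COPY code. */

/* Wasteful: re-measures the same unchanging string for every attribute. */
size_t tz_bytes_per_attribute(const char *tz, size_t nattrs)
{
    assert(tz != NULL);
    size_t total = 0;
    for (size_t i = 0; i < nattrs; i++)
        total += strlen(tz);        /* one libc call per attribute */
    return total;
}

/* Hoisted: measure the string once, then reuse the cached length. */
size_t tz_bytes_hoisted(const char *tz, size_t nattrs)
{
    assert(tz != NULL);
    const size_t tz_len = strlen(tz);   /* single libc call */
    size_t total = 0;
    for (size_t i = 0; i < nattrs; i++)
        total += tz_len;
    return total;
}
```

Both functions return the same value; the second simply makes one strlen() call instead of one per attribute, which is where the savings come from when the attribute count runs into the millions.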

By the way, system routines like fgetc/getc/strlen/atoi, etc., don't appear in
gprof profiles of dynamically linked objects, nor by default in oprofile
results.

If the bottleneck is in I/O, you will see the time spent in disk wait, not
in system.
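
As a rough sketch of how to see that split from inside a program, the following uses the POSIX getrusage() and gettimeofday() calls to separate user CPU, system CPU, and the remaining wall-clock time. The `cpu_split` struct, `measure()`, and `spin()` are hypothetical names for illustration, and treating "real minus user minus sys" as wait time is only an approximation.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Hypothetical helper types and names, for illustration only. */
typedef struct {
    double real;   /* wall-clock seconds */
    double user;   /* user-mode CPU seconds */
    double sys;    /* kernel-mode CPU seconds */
    double wait;   /* approximate time not on the CPU: real - user - sys */
} cpu_split;

static double tv_seconds(struct timeval tv)
{
    return (double) tv.tv_sec + (double) tv.tv_usec / 1e6;
}

/* Run work() and report where this process's time went. */
cpu_split measure(void (*work)(void))
{
    assert(work != NULL);
    struct rusage r0, r1;
    struct timeval t0, t1;

    gettimeofday(&t0, NULL);
    getrusage(RUSAGE_SELF, &r0);
    work();
    getrusage(RUSAGE_SELF, &r1);
    gettimeofday(&t1, NULL);

    cpu_split s;
    s.real = tv_seconds(t1) - tv_seconds(t0);
    s.user = tv_seconds(r1.ru_utime) - tv_seconds(r0.ru_utime);
    s.sys  = tv_seconds(r1.ru_stime) - tv_seconds(r0.ru_stime);
    s.wait = s.real - s.user - s.sys;
    return s;
}

/* Sample CPU-bound workload: a busy loop that does no I/O. */
void spin(void)
{
    volatile unsigned long x = 0;
    for (unsigned long i = 0; i < 50000000UL; i++)
        x += i;
}
```

Calling `measure(spin)` should report nearly all of the time as user CPU. A read loop over a cached file would instead push time into `sys`, and a read that actually goes to disk would show up mostly in `wait`.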

- Luke
