Re: [HACKERS] putting CHECK_FOR_INTERRUPTS in qsort_comparetup()

From: "Charles Duffy" <charles(dot)duffy(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, "Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu>, pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] putting CHECK_FOR_INTERRUPTS in qsort_comparetup()
Date: 2006-07-28 07:39:02
Message-ID: dfdaea8f0607280039g3b9eb7e2ra74590ca160e229f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

On 7/15/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Anyway, Qingqing's question still needs to be answered: how can a sort
> of under 30k items take so long?
>

It happens because (as previously suggested by Tom) the dataset for
the 'short' (~10k rows, .3 sec) sort has no rows whose leftmost fields
evaluate to 'equal' when passed to the qsort compare function. The
'long' sort, (~30k rows, 78 sec) has plenty of rows whose first 6
columns all evaluate as 'equal' when the rows are compared.

For the 'long' data, the compare moves on rightward until it
encounters 'flato', which is a TEXT column with an average length of
7.5k characters (with some rows up to 400k). The first 6 columns are
mostly INTEGER, so compares on them are relatively inexpensive. All
the expensive compares on 'flato' account for the disproportionate
difference in sort times, relative to the number of rows in each set.

As for the potential for memory leaks - thinking about it.

Thanks,

Charles Duffy.

> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> > The merge sort is here:
>
> > http://sourceware.org/cgi-bin/cvsweb.cgi/libc/stdlib/msort.c?rev=1.21&content-type=text/x-cvsweb-markup&cvsroot=glibc
>
> > It uses alloca, so we're good here.
>
> Uh ... but it also uses malloc, and potentially a honkin' big malloc at
> that (up to a quarter of physical RAM). So I'm worried again.
>
> Anyway, Qingqing's question still needs to be answered: how can a sort
> of under 30k items take so long?
>
> regards, tom lane
>

Attachment Content-Type Size
NG_viewdef.txt text/plain 527 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Chris Mair 2006-07-28 08:03:21 selecting large result sets in psql using cursors
Previous Message Hannu Krosing 2006-07-28 07:36:04 Re: The vacuum-ignore-vacuum patch

Browse pgsql-patches by date

  From Date Subject
Next Message Chris Mair 2006-07-28 08:03:21 selecting large result sets in psql using cursors
Previous Message Hannu Krosing 2006-07-28 07:36:04 Re: The vacuum-ignore-vacuum patch