Re: Threaded Sorting

From: "Shridhar Daithankar" <shridhar_daithankar(at)persistent(dot)co(dot)in>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Threaded Sorting
Date: 2002-10-04 07:54:47
Message-ID: 3D9D969F.12353.A91A0@localhost
Lists: pgsql-hackers

On 4 Oct 2002 at 9:46, Hans-Jürgen Schönig wrote:

> Did anybody think about threaded sorting so far?
> Assume an SMP machine. In the case of building an index or in the case
> of sorting a lot of data there is just one backend working. Therefore
> just one CPU is used.
> What about starting a thread for every temporary file being created?
> This way CREATE INDEX could use many CPUs.
> Maybe this is worth thinking about because it will speed up huge
> databases and enterprise level computing.

I have a better plan. I have a thread architecture ready that acts as a set of
generic thread templates. Even the function pointers in a thread can be altered
on the fly.

I suggest we use some such architecture for threading. It can be used in any
module without hardcoding things. In sorting, say, we would assign exclusive
jobs/data ranges to threads, so there would be minimal locking, and one thread
could merge the results. Something like that.
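
A minimal sketch of that range-per-thread idea, not the code I mentioned. It
assumes C++11 std::thread is available, and the names sort_range and
threaded_sort are hypothetical:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical entry point: sorts only data[first, last).
// No locking is needed because each caller owns a disjoint range.
void sort_range(std::vector<int>& data, std::size_t first, std::size_t last) {
    std::sort(data.begin() + static_cast<std::ptrdiff_t>(first),
              data.begin() + static_cast<std::ptrdiff_t>(last));
}

// Hypothetical driver: one thread per exclusive range, then the calling
// thread merges the sorted runs.
void threaded_sort(std::vector<int>& data, std::size_t num_workers) {
    assert(num_workers > 0);
    const std::size_t n = data.size();

    // bounds[i]..bounds[i + 1] is the range handed to worker i.
    std::vector<std::size_t> bounds;
    for (std::size_t i = 0; i <= num_workers; ++i)
        bounds.push_back(n * i / num_workers);

    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < num_workers; ++i)
        workers.emplace_back(sort_range, std::ref(data), bounds[i], bounds[i + 1]);
    for (auto& w : workers)
        w.join();

    // Fold each sorted run into the already merged prefix.
    for (std::size_t i = 1; i < num_workers; ++i)
        std::inplace_merge(data.begin(),
                           data.begin() + static_cast<std::ptrdiff_t>(bounds[i]),
                           data.begin() + static_cast<std::ptrdiff_t>(bounds[i + 1]));
}
```

Calling threaded_sort(v, 4) on a std::vector<int> v sorts it in place. The
merge here is a simple serial fold; a real external sort would merge runs from
temporary files instead.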

All it takes is to change the entry functions to accept one more parameter that
indicates the range of values to act upon. In the non-threaded version it's not
there because the function acts on the entire data set.

Furthermore, with this model threading support can be turned off easily. In the
non-threaded model, a wrapper function can call the entry point in series with
the necessary arguments. So PostgreSQL does not have to deal with
not-so-good-enough thread implementations. Keeping with the tradition of
conservative defaults, we can set threads to off by default.

The code is in C++ but it's hardly a couple of pages. I can convert it to C and
post it if required.

Let me know..

Bye
Shridhar

--
Parkinson's Fourth Law: The number of people in any working group tends to
increase regardless of the amount of work to be done.
