
Threads

From: Shridhar Daithankar <shridhar_daithankar(at)persistent(dot)co(dot)in>
To: PGHackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Threads
Date: 2003-01-03 15:24:11
Message-ID: 200301032054.11125.shridhar_daithankar@persistent.co.in
Lists: pgsql-hackers
Hi all,

I am sure many of you would like to delete this message before reading it, 
but hold on. :-)

There is much talk about threading on this list, and the idea is always 
deferred for want of robust thread models across all supported platforms and 
because it is unclear whether the gains justify the effort required.

I think threads are useful in different situations, namely overlapping 
blocking operations with other work and using multiple CPUs.

Attached is a framework that I ported to C from a C++ server I have written. 
It has a thread pool and a thread implementation based on pthreads.

This code expects only a minimal pthreads implementation and assumes nothing 
about how threads are implemented (e.g. kernel threads or not).

I request hackers on this list to take a look at it. It should be easy to 
plug into any source code and is released without any strings, for any use.

This framework allows the worker function and its argument to be plugged in 
on the fly. The threads created sleep by default and can be woken up as and 
when required.
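
To make the idea concrete, here is a minimal sketch of a single sleeping 
worker that accepts a function and argument on the fly. All names (worker_t, 
worker_submit and so on) are made up for illustration and are not the API of 
the attached thread.h; it assumes only POSIX mutexes and condition variables.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef void (*work_fn)(void *);

typedef struct {
    pthread_t       tid;
    pthread_mutex_t lock;
    pthread_cond_t  wake;  /* signalled when work arrives or on shutdown */
    pthread_cond_t  idle;  /* signalled when the worker finishes a job */
    work_fn         fn;    /* pending job, NULL if none */
    void           *arg;
    int             busy, quit;
} worker_t;

static void *worker_main(void *p)
{
    worker_t *w = p;
    pthread_mutex_lock(&w->lock);
    for (;;) {
        /* Sleep until a job is plugged in or shutdown is requested. */
        while (w->fn == NULL && !w->quit)
            pthread_cond_wait(&w->wake, &w->lock);
        if (w->fn == NULL)          /* quit requested, nothing pending */
            break;
        work_fn fn = w->fn;
        void *arg = w->arg;
        w->fn = NULL;
        w->busy = 1;
        pthread_mutex_unlock(&w->lock);
        fn(arg);                    /* run the job without holding the lock */
        pthread_mutex_lock(&w->lock);
        w->busy = 0;
        pthread_cond_broadcast(&w->idle);
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

int worker_start(worker_t *w)
{
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->wake, NULL);
    pthread_cond_init(&w->idle, NULL);
    w->fn = NULL; w->arg = NULL; w->busy = 0; w->quit = 0;
    return pthread_create(&w->tid, NULL, worker_main, w);
}

/* Block until the worker has no pending or running job. */
void worker_wait(worker_t *w)
{
    pthread_mutex_lock(&w->lock);
    while (w->fn != NULL || w->busy)
        pthread_cond_wait(&w->idle, &w->lock);
    pthread_mutex_unlock(&w->lock);
}

/* Plug in a function and argument, then wake the sleeping thread. */
void worker_submit(worker_t *w, work_fn fn, void *arg)
{
    assert(fn != NULL);
    pthread_mutex_lock(&w->lock);
    while (w->fn != NULL || w->busy)    /* wait for the previous job */
        pthread_cond_wait(&w->idle, &w->lock);
    w->fn = fn;
    w->arg = arg;
    pthread_cond_signal(&w->wake);
    pthread_mutex_unlock(&w->lock);
}

void worker_stop(worker_t *w)
{
    pthread_mutex_lock(&w->lock);
    w->quit = 1;
    pthread_cond_signal(&w->wake);
    pthread_mutex_unlock(&w->lock);
    pthread_join(w->tid, NULL);
    pthread_cond_destroy(&w->wake);
    pthread_cond_destroy(&w->idle);
    pthread_mutex_destroy(&w->lock);
}

/* Example job for illustration: increments the int it is given. */
void increment_job(void *p) { ++*(int *)p; }
```

A caller would do worker_start(), then worker_submit() as often as needed, 
then worker_wait() and worker_stop(). A pool is then just an array of such 
workers plus a policy for picking an idle one.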

I propose to use it incrementally in PostgreSQL. Let's start with I/O. When a 
block of data is being read, rather than blocking on the read, we can set up 
a creator-consumer (producer-consumer) link between two threads. That way we 
can utilize that I/O time in an overlapped fashion.
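
As a rough illustration of that overlap, the sketch below has one thread 
doing the blocking read() calls into a two-slot buffer while the calling 
thread processes the previous block. The names, the 8192-byte block size and 
the byte-summing "work" are assumptions chosen for the example; nothing here 
is taken from the PostgreSQL source or from the attached files.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

#define BLOCK_SIZE 8192   /* assumed block size, for illustration only */
#define NSLOTS     2      /* double buffering */

typedef struct {
    int             fd;
    pthread_mutex_t lock;
    pthread_cond_t  not_full, not_empty;
    char            data[NSLOTS][BLOCK_SIZE];
    ssize_t         len[NSLOTS];  /* bytes per slot; <= 0 marks EOF or error */
    int             head, tail, count;
} block_pipe;

/* Producer: fills free slots with blocks read from fd. */
static void *reader_main(void *p)
{
    block_pipe *bp = p;
    for (;;) {
        pthread_mutex_lock(&bp->lock);
        while (bp->count == NSLOTS)
            pthread_cond_wait(&bp->not_full, &bp->lock);
        int slot = bp->head;
        pthread_mutex_unlock(&bp->lock);

        /* The blocking read runs outside the lock, so the consumer
         * keeps working on the other slot in the meantime. */
        ssize_t n = read(bp->fd, bp->data[slot], BLOCK_SIZE);

        pthread_mutex_lock(&bp->lock);
        bp->len[slot] = n;
        bp->head = (bp->head + 1) % NSLOTS;
        bp->count++;
        pthread_cond_signal(&bp->not_empty);
        pthread_mutex_unlock(&bp->lock);
        if (n <= 0)
            return NULL;
    }
}

/* Consumer: sums all bytes from fd; returns -1 if the thread can't start.
 * A read error simply ends the stream in this sketch. */
long sum_bytes_overlapped(int fd)
{
    block_pipe bp;
    pthread_t reader;
    long total = 0;

    assert(fd >= 0);
    bp.fd = fd;
    bp.head = bp.tail = bp.count = 0;
    pthread_mutex_init(&bp.lock, NULL);
    pthread_cond_init(&bp.not_full, NULL);
    pthread_cond_init(&bp.not_empty, NULL);
    if (pthread_create(&reader, NULL, reader_main, &bp) != 0)
        return -1;

    for (;;) {
        pthread_mutex_lock(&bp.lock);
        while (bp.count == 0)
            pthread_cond_wait(&bp.not_empty, &bp.lock);
        int slot = bp.tail;
        ssize_t n = bp.len[slot];
        pthread_mutex_unlock(&bp.lock);
        if (n <= 0)
            break;

        for (ssize_t i = 0; i < n; i++)   /* stand-in for real processing */
            total += (unsigned char)bp.data[slot][i];

        pthread_mutex_lock(&bp.lock);     /* hand the slot back */
        bp.tail = (bp.tail + 1) % NSLOTS;
        bp.count--;
        pthread_cond_signal(&bp.not_full);
        pthread_mutex_unlock(&bp.lock);
    }
    pthread_join(reader, NULL);
    pthread_cond_destroy(&bp.not_full);
    pthread_cond_destroy(&bp.not_empty);
    pthread_mutex_destroy(&bp.lock);
    return total;
}
```

Each slot is touched by only one thread at a time, so the mutex protects 
just the head/tail/count bookkeeping and is never held during the read or 
the processing.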

Further, threads can be useful when the server has more than one CPU. A 
back-end can spread CPU-intensive work such as index creation or sorting 
across different threads. This way we can utilise idle CPUs, which we cannot 
do as of now.

There are many advantages that I can see.

1) Threads can optionally be turned on or off depending on the configuration. 
So we can keep the existing functionality entirely and convert pieces 
one by one to use threads.

2) For each piece of functionality we can have two code branches: one that 
does not use threads, i.e. the current code base, and one that does. Agreed, 
the binary will be a bit bloated, but that would give enormous flexibility. 
If we find a threaded implementation buggy, we simply switch it off either at 
compile time or in configuration.

3) Not much effort should be required to plug code into this model. The idea 
of using threads is to assign exclusive work to each thread, so that should 
not require much locking.
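
The two switches in points 1 and 2 could look roughly like the sketch below. 
USE_THREADS and enable_threads are hypothetical names for a build-time 
option and a run-time setting; they are not existing PostgreSQL options.

```c
#include <assert.h>
#include <stddef.h>
#ifdef USE_THREADS               /* hypothetical build-time switch */
#include <pthread.h>
#endif

/* Hypothetical run-time setting, e.g. loaded from a configuration file. */
int enable_threads = 1;

typedef void *(*task_fn)(void *);

/* Run the same task on two independent arguments. Returns 1 if the
 * threaded branch was used, 0 if the work ran serially. */
int run_pair(task_fn f, void *a, void *b)
{
    assert(f != NULL);
#ifdef USE_THREADS
    if (enable_threads) {
        pthread_t t;
        if (pthread_create(&t, NULL, f, a) == 0) {
            f(b);                    /* calling thread takes the other half */
            pthread_join(t, NULL);
            return 1;
        }
        /* thread creation failed: fall back to the serial branch */
    }
#endif
    f(a);                            /* existing, non-threaded code path */
    f(b);
    return 0;
}

/* Example task for illustration: squares the int it is given. */
void *square_task(void *p)
{
    int *x = p;
    *x *= *x;
    return NULL;
}
```

Built without -DUSE_THREADS, the threaded branch is not even compiled in; 
built with it, setting enable_threads to 0 still selects the serial branch.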

To use multiple CPUs, separate functions need to be written that handle 
their work in a thread-safe fashion. A merger function would also be 
required to merge the results of the worker threads. Both would be entirely 
new code.
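
For sorting, the worker-plus-merger split might look like the following 
sketch: each thread sorts its own half of an int array with qsort(), and a 
merge step combines the two runs. This is only an illustration with made-up 
names, assuming a qsort() that is safe to call from several threads at once 
(POSIX does not list it among the thread-unsafe functions); it is not how 
PostgreSQL sorts.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

typedef struct { int *base; size_t n; } slice_t;

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Worker: sorts only its own slice, so no locking is needed. */
static void *sort_slice(void *p)
{
    slice_t *s = p;
    qsort(s->base, s->n, sizeof(int), cmp_int);
    return NULL;
}

/* Merger: combines two sorted runs into out. */
static void merge_runs(const int *a, size_t na,
                       const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, k = 0;
    while (i < na && j < nb)
        out[k++] = (b[j] < a[i]) ? b[j++] : a[i++];
    while (i < na) out[k++] = a[i++];
    while (j < nb) out[k++] = b[j++];
}

/* Sort v[0..n-1] using two threads. Returns 0 on success, -1 on failure. */
int parallel_sort(int *v, size_t n)
{
    assert(v != NULL || n == 0);
    if (n < 2)
        return 0;

    size_t half = n / 2;
    slice_t lo = { v, half }, hi = { v + half, n - half };
    pthread_t t;
    int *tmp = malloc(n * sizeof(int));
    if (tmp == NULL)
        return -1;
    if (pthread_create(&t, NULL, sort_slice, &hi) != 0) {
        free(tmp);
        return -1;
    }
    sort_slice(&lo);              /* calling thread sorts the other half */
    pthread_join(t, NULL);

    merge_runs(lo.base, lo.n, hi.base, hi.n, tmp);
    memcpy(v, tmp, n * sizeof(int));
    free(tmp);
    return 0;
}
```

The slices do not overlap, which is what "exclusive work for each thread" 
buys: the only synchronisation point is the pthread_join() before the merge.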

I would say two threads per CPU per back-end should be a reasonable default, 
as that would cover I/O blocking well, unless of course threading is turned 
off in the build or in configuration.
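
Such a default could be computed as in the short sketch below. It assumes 
the _SC_NPROCESSORS_ONLN sysconf() name, which is widely available (Linux, 
the BSDs, macOS) but may be missing on some platforms, so it falls back to 
one CPU; default_pool_size is a made-up name.

```c
#include <assert.h>
#include <unistd.h>

/* Suggested number of threads per back-end: per_cpu threads for every
 * online CPU, assuming a single CPU when the count is unavailable. */
long default_pool_size(long per_cpu)
{
    long ncpu = -1;

    assert(per_cpu > 0);
#ifdef _SC_NPROCESSORS_ONLN
    ncpu = sysconf(_SC_NPROCESSORS_ONLN);
#endif
    if (ncpu < 1)
        ncpu = 1;
    return ncpu * per_cpu;
}
```

Calling default_pool_size(2) gives the two-threads-per-CPU figure suggested 
above.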

Please note that I have tested the code in C++ and my C is rusty. Quite likely 
there are bugs in the code. I will stress test the code on Monday, but I would 
like to seek opinions on this as soon as possible. (Hey, but it compiles 
clean.)

If required, I can post example usage of this code, but I don't think that 
should be necessary. :-)

Bye
 Shridhar

Attachment: thread.c
Description: text/x-csrc (2.9 KB)
Attachment: thread.h
Description: text/x-chdr (884 bytes)
