Threads

From: Shridhar Daithankar <shridhar_daithankar(at)persistent(dot)co(dot)in>
To: PGHackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Threads
Date: 2003-01-03 15:24:11
Message-ID: 200301032054.11125.shridhar_daithankar@persistent.co.in
Lists: pgsql-hackers

Hi all,

I am sure many of you would like to delete this message before reading it, but
hold on. :-)

There is much talk about threading on this list, and the idea is always
deferred for want of robust thread models across all supported platforms and
because the gains are not clearly worth the effort required.

I think threads are useful in two different situations, namely parallelising
blocking operations and using multiple CPUs.

Attached is a framework that I ported to C from a C++ server I have written.
It has thread pool and thread implementations based on pthreads.

This code expects only a minimal pthreads implementation and does not assume
anything about how threads are implemented (e.g. kernel threads or not).

I request hackers on this list to take a look at it. It should be easy to
plug into any source code and is released without any strings attached, for
any use.

This framework allows the worker function and its argument to be plugged in on
the fly. The threads created are sleeping by default and can be woken up as
and when required.
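
To make the idea concrete, here is a minimal sketch of one sleeping worker that
is woken with a function and an argument. It is not the attached thread.c; every
name in it (worker_t, worker_submit, square_in_place and so on) is hypothetical
and only plain POSIX pthreads calls are assumed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; not taken from the attached thread.h. */
typedef void *(*work_fn)(void *);

typedef struct {
    pthread_t       tid;
    pthread_mutex_t lock;
    pthread_cond_t  wake;    /* signalled when a job arrives or on shutdown */
    pthread_cond_t  done;    /* signalled when the current job has finished */
    work_fn         fn;      /* NULL means no job is pending */
    void           *arg;
    void           *result;
    bool            busy;    /* a job is pending or running */
    bool            quit;
} worker_t;

static void *worker_main(void *p)
{
    worker_t *w = p;
    pthread_mutex_lock(&w->lock);
    for (;;) {
        while (w->fn == NULL && !w->quit)       /* sleep until woken */
            pthread_cond_wait(&w->wake, &w->lock);
        if (w->fn == NULL)                      /* quit requested, nothing pending */
            break;
        work_fn fn = w->fn;
        void *arg = w->arg;
        w->fn = NULL;
        pthread_mutex_unlock(&w->lock);
        void *r = fn(arg);                      /* run the job without the lock */
        pthread_mutex_lock(&w->lock);
        w->result = r;
        w->busy = false;
        pthread_cond_broadcast(&w->done);
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

/* Create the worker; it goes to sleep immediately. Returns 0 on success. */
int worker_start(worker_t *w)
{
    w->fn = NULL; w->arg = NULL; w->result = NULL;
    w->busy = false; w->quit = false;
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->wake, NULL);
    pthread_cond_init(&w->done, NULL);
    return pthread_create(&w->tid, NULL, worker_main, w);
}

/* Plug in a function and argument, then wake the worker. */
void worker_submit(worker_t *w, work_fn fn, void *arg)
{
    assert(fn != NULL);
    pthread_mutex_lock(&w->lock);
    while (w->busy)                             /* one job at a time */
        pthread_cond_wait(&w->done, &w->lock);
    w->fn = fn; w->arg = arg; w->busy = true;
    pthread_cond_signal(&w->wake);
    pthread_mutex_unlock(&w->lock);
}

/* Block until the submitted job has finished and return its result. */
void *worker_wait(worker_t *w)
{
    pthread_mutex_lock(&w->lock);
    while (w->busy)
        pthread_cond_wait(&w->done, &w->lock);
    void *r = w->result;
    pthread_mutex_unlock(&w->lock);
    return r;
}

/* Ask the worker to exit once idle, then release its resources. */
void worker_stop(worker_t *w)
{
    pthread_mutex_lock(&w->lock);
    w->quit = true;
    pthread_cond_signal(&w->wake);
    pthread_mutex_unlock(&w->lock);
    pthread_join(w->tid, NULL);
    pthread_cond_destroy(&w->wake);
    pthread_cond_destroy(&w->done);
    pthread_mutex_destroy(&w->lock);
}

/* Example job: squares the int that arg points to and returns arg. */
void *square_in_place(void *arg)
{
    int *v = arg;
    *v = *v * *v;
    return arg;
}
```

A caller would do worker_start(), then worker_submit(&w, square_in_place, &v)
and worker_wait(&w), and finally worker_stop(). A pool is then just an array of
such workers.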

I propose to use it incrementally in postgresql. Let's start with I/O. When a
block of data is being read, rather than blocking on the read, we can set up a
creator-consumer link between two threads. That way we can utilize that I/O
time in an overlapped fashion.
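
As a rough illustration of such a link, the sketch below has one thread
blocking in read() while the other consumes the previous block. It is not
PostgreSQL code; readahead_t, BLOCK_SIZE, count_bytes and the other names are
assumptions made up for this example.

```c
#include <pthread.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 8192            /* assumed block size for this sketch */

/* One-slot hand-off between a reader thread and a consumer (hypothetical). */
typedef struct {
    int             fd;
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    char            block[BLOCK_SIZE];
    ssize_t         len;           /* bytes in block; 0 = EOF, -1 = error */
    bool            full;
} readahead_t;

/* Creator side: blocks in read() while the consumer works on earlier data. */
static void *reader_main(void *p)
{
    readahead_t *ra = p;
    char tmp[BLOCK_SIZE];
    for (;;) {
        ssize_t n = read(ra->fd, tmp, sizeof tmp);   /* lock is not held here */
        pthread_mutex_lock(&ra->lock);
        while (ra->full)                             /* wait for a free slot */
            pthread_cond_wait(&ra->changed, &ra->lock);
        if (n > 0)
            memcpy(ra->block, tmp, (size_t)n);
        ra->len = n;
        ra->full = true;
        pthread_cond_signal(&ra->changed);
        pthread_mutex_unlock(&ra->lock);
        if (n <= 0)
            return NULL;                             /* EOF or error ends the thread */
    }
}

int readahead_start(readahead_t *ra, pthread_t *tid, int fd)
{
    ra->fd = fd; ra->len = 0; ra->full = false;
    pthread_mutex_init(&ra->lock, NULL);
    pthread_cond_init(&ra->changed, NULL);
    return pthread_create(tid, NULL, reader_main, ra);
}

/* Consumer side: copy the next block into out; returns its length,
 * 0 at end of file, or -1 on a read error. */
ssize_t readahead_next(readahead_t *ra, char *out)
{
    pthread_mutex_lock(&ra->lock);
    while (!ra->full)
        pthread_cond_wait(&ra->changed, &ra->lock);
    ssize_t n = ra->len;
    if (n > 0) {
        memcpy(out, ra->block, (size_t)n);
        ra->full = false;          /* free the slot; EOF/error stays latched */
    }
    pthread_cond_signal(&ra->changed);
    pthread_mutex_unlock(&ra->lock);
    return n;
}

/* Demo consumer: counts the bytes readable from fd, or returns -1 on error. */
long count_bytes(int fd)
{
    readahead_t ra;
    pthread_t tid;
    char buf[BLOCK_SIZE];
    long total = 0;
    ssize_t n;

    if (readahead_start(&ra, &tid, fd) != 0)
        return -1;
    while ((n = readahead_next(&ra, buf)) > 0)
        total += n;                /* real per-block work would go here */
    pthread_join(tid, NULL);
    pthread_cond_destroy(&ra.changed);
    pthread_mutex_destroy(&ra.lock);
    return n < 0 ? -1 : total;
}
```

The extra memcpy is the price of keeping the sketch short; a real version
would more likely swap between two buffers.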

Further, threads can be useful when the server has more than one CPU. It can
spread CPU-intensive work, such as index creation or sorting, across different
threads. This way we can utilise idle CPUs, which we cannot do as of now.

There are many advantages that I can see.

1) Threads can be optionally turned on/off depending upon the configuration.
So we can keep the existing functionality entirely and convert it piece by
piece to threaded operation.

2) For each piece of functionality we can have two code branches, one that
does not use threads, i.e. the current code base, and one that can use
threads. Agreed, the binary will be a bit bloated, but that would give
enormous flexibility. If we find a threaded implementation buggy, we simply
switch it off, either at compile time or in configuration.

3) Not much effort should be required to plug code into this model. The idea
of using threads is to assign exclusive work to each thread, so that should
not require much locking.

In the case of using multiple CPUs, separate functions need to be written that
can handle things in a thread-safe fashion. Also, a merger function would be
required to merge the results of the worker threads. That would be entirely
additional code.
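
A small sketch of that worker-plus-merger shape, using sorting as the example:
two threads each sort their own half of an array, so no locking is needed, and
a merge step combines the results. All names here (slice_t, parallel_sort and
so on) are hypothetical, and the standard qsort() stands in for whatever sort
routine would really be used.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Each worker owns one slice exclusively (hypothetical type). */
typedef struct {
    int   *base;
    size_t n;
} slice_t;

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Worker function: sorts its own slice in place. */
static void *sort_slice(void *p)
{
    slice_t *s = p;
    qsort(s->base, s->n, sizeof(int), cmp_int);
    return NULL;
}

/* Merger function: combines two sorted slices into out. */
static void merge_slices(const slice_t *a, const slice_t *b, int *out)
{
    size_t i = 0, j = 0, k = 0;
    while (i < a->n && j < b->n)
        out[k++] = (b->base[j] < a->base[i]) ? b->base[j++] : a->base[i++];
    while (i < a->n)
        out[k++] = a->base[i++];
    while (j < b->n)
        out[k++] = b->base[j++];
}

/* Sorts data[0..n) using one extra thread. Returns 0 on success, -1 on failure. */
int parallel_sort(int *data, size_t n)
{
    if (n < 2)
        return 0;

    slice_t lo = { data, n / 2 };
    slice_t hi = { data + n / 2, n - n / 2 };
    pthread_t t;
    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;

    if (pthread_create(&t, NULL, sort_slice, &lo) != 0) {
        free(tmp);
        return -1;
    }
    sort_slice(&hi);               /* the calling thread sorts the other half */
    pthread_join(t, NULL);

    merge_slices(&lo, &hi, tmp);
    memcpy(data, tmp, n * sizeof *tmp);
    free(tmp);
    return 0;
}
```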

I would say two threads per CPU per back-end should be a reasonable default, as
that would cover I/O blocking well. That is, of course, unless threading is
turned off in the build or in configuration.
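
The default and the two off switches could look roughly like the sketch below.
USE_THREADS and default_thread_count are made-up names, and
sysconf(_SC_NPROCESSORS_ONLN) is a common extension rather than something every
platform guarantees, so the code falls back to one CPU when it is missing.

```c
#include <unistd.h>

/* Build-time switch; USE_THREADS is a hypothetical macro name. */
#ifndef USE_THREADS
#define USE_THREADS 1
#endif

/* Returns how many worker threads a back-end should create.
 * enabled_in_config stands in for a run-time configuration setting. */
int default_thread_count(int enabled_in_config)
{
#if USE_THREADS
    long cpus = -1;

    if (!enabled_in_config)
        return 0;                  /* switched off in configuration */
#ifdef _SC_NPROCESSORS_ONLN
    cpus = sysconf(_SC_NPROCESSORS_ONLN);
#endif
    if (cpus < 1)
        cpus = 1;                  /* unknown CPU count: assume one */
    return (int)(2 * cpus);        /* two threads per CPU */
#else
    (void)enabled_in_config;
    return 0;                      /* switched off at build time */
#endif
}
```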

Please note that I have tested the code in C++ and my C is rusty. Quite likely
there are bugs in the code. I will stress test the code on Monday, but I would
like to seek an opinion on this as soon as possible. (Hey, but it compiles
clean.)

If required, I can post example usage of this code, but I don't think that
should be necessary. :-)

Bye
Shridhar

Attachment Content-Type Size
thread.h text/x-chdr 884 bytes
thread.c text/x-csrc 2.9 KB
