From: | "Dann Corbit" <DCorbit(at)connx(dot)com> |
---|---|
To: | "PGHackers" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Threads |
Date: | 2003-01-03 20:52:48 |
Message-ID: | D90A5A6C612A39408103E6ECDD77B829408A20@voyager.corporate.connx.com |
Lists: | pgsql-hackers |
> -----Original Message-----
> From: mlw [mailto:pgsql(at)mohawksoft(dot)com]
> Sent: Friday, January 03, 2003 12:47 PM
> To: Shridhar Daithankar
> Cc: PGHackers
> Subject: Re: [HACKERS] Threads
>
>
> Please no threading threads!!!
>
> Has anyone calculated the interval and period of "PostgreSQL needs
> threads" posts?
>
> The *ONLY* advantage threading has over multiple processes is the time
> and resources used in creating new processes.
Threading is absurdly easier to do portably than fork().
Will you fork() successfully on MVS, VMS, OS/2, Win32?
On some operating systems, thread creation is absurdly faster than
process creation (many orders of magnitude).
> That being said, I admit that creating a threaded program is easier than
> one with multiple processes, but PostgreSQL is already there and working.
>
> Drawbacks to a threaded model:
>
> (1) One thread screws up, the whole process dies. In a multiple process
> application this is not too much of an issue.
If you use C++ you can try/catch and nothing bad happens to anything but
the naughty thread.
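A minimal sketch of the try/catch point, assuming C++11 `std::thread` (which postdates this message) and the hypothetical names `do_work` and `run_workers`. It only illustrates keeping a C++ exception inside the thread that raised it; it does nothing for faults such as a wild pointer write, which no catch block intercepts.

```cpp
#include <atomic>
#include <stdexcept>
#include <thread>
#include <vector>

// Hypothetical worker: id 2 throws to simulate the "naughty" thread.
void do_work(int id) {
    if (id == 2) throw std::runtime_error("bad request");
}

// Starts n workers. Each worker catches its own exceptions, so a failure
// never escapes the thread function (an escaped exception would call
// std::terminate and take the whole process down).
// Returns how many workers finished without throwing.
int run_workers(int n) {
    std::atomic<int> ok{0};
    std::vector<std::thread> threads;
    for (int id = 0; id < n; ++id) {
        threads.emplace_back([id, &ok] {
            try {
                do_work(id);
                ++ok;
            } catch (const std::exception&) {
                // Only this thread's unit of work is abandoned.
            }
        });
    }
    for (auto& t : threads) t.join();
    return ok.load();
}
```

With four workers, `run_workers(4)` reports three successes and the process keeps running.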
> (2) Heap fragmentation. In a long uptime application, such as a
> database, heap fragmentation is an important consideration. With
> multiple processes, each process manages its own heap and whatever
> fragmentation exists goes away when the connection is closed. A
> threaded server is far more vulnerable because the heap has to manage
> many threads and the heap has to stay active and unfragmented in
> perpetuity. This is why Windows applications usually end up using 2G of
> memory after 3 months of use. (Well, this AND memory leaks)
Poorly written applications leak memory. Fragmentation is a legitimate
concern.
> (3) Stack space. In a threaded application there are more limits to stack
> usage. I'm not sure, but I bet PostgreSQL would have a problem with a
> fixed size stack; I know the old ODBC driver did.
A single server with 20 threads will consume less total free store
memory and automatic memory than 20 servers. You have to decide how
much stack to give a thread, that's true.
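As a rough illustration of choosing a thread's stack size, here is a sketch assuming a POSIX threads platform; `run_with_stack` and `mark_done` are hypothetical names, and the 1 MiB figure in the usage note is an arbitrary example, not a recommendation for PostgreSQL.

```cpp
#include <pthread.h>
#include <cstddef>

// Thread body: just records that it ran.
static void* mark_done(void* arg) {
    *static_cast<bool*>(arg) = true;
    return nullptr;
}

// Creates one thread with an explicit stack size and waits for it.
// Returns true if the thread was created and ran to completion.
bool run_with_stack(std::size_t stack_bytes) {
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0) return false;

    // Fails if the size is below the platform minimum (PTHREAD_STACK_MIN).
    if (pthread_attr_setstacksize(&attr, stack_bytes) != 0) {
        pthread_attr_destroy(&attr);
        return false;
    }

    bool ran = false;
    pthread_t tid;
    int rc = pthread_create(&tid, &attr, mark_done, &ran);
    pthread_attr_destroy(&attr);
    if (rc != 0) return false;

    pthread_join(tid, nullptr);
    return ran;
}
```

A call such as `run_with_stack(1024 * 1024)` asks for a 1 MiB stack; the size is fixed at creation, which is the decision referred to above.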
> (4) Lock Contention. The various single points of access in a process
> have to be serialized for multiple threads. Heap allocation,
> deallocation, etc. all have to be managed. In a multiple process model,
> these resources would be separated by process contexts.
Semaphores are more complicated than critical sections. If anything, a
shared memory approach is more problematic and fragile, especially when
porting to multiple operating systems.
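For comparison, a critical section inside one threaded process can be as small as the sketch below. It assumes C++11 `std::mutex` standing in for a critical section, and `count_with_threads` is a hypothetical name used only for illustration.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Each thread adds per_thread increments to one shared counter.
// The mutex is the critical section: only one thread at a time
// may execute the guarded increment.
long count_with_threads(int n_threads, int per_thread) {
    long counter = 0;
    std::mutex guard;
    std::vector<std::thread> threads;
    for (int i = 0; i < n_threads; ++i) {
        threads.emplace_back([&counter, &guard, per_thread] {
            for (int j = 0; j < per_thread; ++j) {
                std::lock_guard<std::mutex> lock(guard);  // released at scope exit
                ++counter;
            }
        });
    }
    for (auto& t : threads) t.join();
    return counter;
}
```

The shared state is an ordinary local variable; no shared memory segment or named semaphore has to be created, attached, or cleaned up.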
> (5) Lastly, why bother? Seriously? Process creation time is an issue,
> true, but it's an issue with threads as well, just not as bad. Anyone who
> is looking for performance should be using a connection pooling
> mechanism as is done in things like PHP.
>
> I have done both threaded and process servers. The threaded servers are
> easier to write. The process based servers are more robust. From an
> operational point of view, a "select foo from bar where x > y" will take
> the same amount of time.
Probably true. I think a better solution is a server that can start
threads or processes or both. But that's neither here nor there and I'm
certainly not volunteering to write it.
Here is a solution to the dilemma. Make the one who suggests the
feature be the first volunteer on the team that writes it.
Is it a FAQ? If not, it ought to be.