Re: Spinlock performance improvement proposal

From: "D(dot) Hageman" <dhageman(at)dracken(dot)com>
To: Ian Lance Taylor <ian(at)airs(dot)com>
Cc: mlw <markw(at)mohawksoft(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Spinlock performance improvement proposal
Date: 2001-09-26 23:18:08
Message-ID: Pine.LNX.4.33.0109261733050.2225-100000@typhon.dracken.com
Lists: pgsql-hackers

On 26 Sep 2001, Ian Lance Taylor wrote:
>
> > Save for the fact that the kernel can switch between threads faster than
> > it can switch processes considering threads share the same address space,
> > stack, code, etc. If need be, sharing the data between threads is much
> > easier than sharing between processes.
>
> When using a kernel threading model, it's not obvious to me that the
> kernel will switch between threads much faster than it will switch
> between processes. As far as I can see, the only potential savings is
> not reloading the pointers to the page tables. That is not nothing,
> but it is also not a lot.

It is my understanding that avoiding a full context switch of the
processor can be a significant advantage. This is especially important
on processor architectures that can be kinda slow at doing it (x86). I
will admit that most modern kernels have features that assist software
packages utilizing the forking model (copy on write, for instance). It is
also my impression that these do a good job. I am the kind of guy who
looks towards the future (as in a year, a year and a half or so) and says
that processors will hopefully get faster at context switching and more
and more kernels will implement these algorithms to speed up the forking
model. At the same time, I see more and more processors being shoved into
a single box, and it appears that the threads model works better on these
types of systems.

> > I can't comment on the "isolate data" line. I am still trying to figure
> > that one out.
>
> Sometimes you need data which is specific to a particular thread.

When you need data that is specific to a thread you use a TSD (Thread
Specific Data).
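
For anyone who has not used TSD, here is a minimal sketch of the idea with
POSIX threads. The names counter_key, my_counter, worker and tsd_demo are
made up for illustration and have nothing to do with the Postgres source;
the only real API used is pthread_key_create, pthread_once,
pthread_getspecific and pthread_setspecific.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* One key, shared by all threads; each thread stores its own value under it. */
static pthread_key_t counter_key;
static pthread_once_t counter_key_once = PTHREAD_ONCE_INIT;

static void make_counter_key(void)
{
    /* free() is called on a thread's value when that thread exits. */
    int rc = pthread_key_create(&counter_key, free);
    assert(rc == 0);
    (void) rc;
}

/* Return the calling thread's private counter, allocating it on first use. */
static int *my_counter(void)
{
    int *p;

    pthread_once(&counter_key_once, make_counter_key);
    p = pthread_getspecific(counter_key);
    if (p == NULL) {
        p = calloc(1, sizeof *p);
        if (p == NULL)
            abort();
        pthread_setspecific(counter_key, p);
    }
    return p;
}

/* Bump this thread's counter n times and report its final value. */
static void *worker(void *arg)
{
    int n = (int) (intptr_t) arg;

    for (int i = 0; i < n; i++)
        ++*my_counter();
    return (void *) (intptr_t) *my_counter();
}

/* Hypothetical driver: returns 1 if each thread saw only its own counter. */
int tsd_demo(void)
{
    pthread_t a, b;
    void *ra = NULL, *rb = NULL;

    if (pthread_create(&a, NULL, worker, (void *) (intptr_t) 3) != 0)
        return 0;
    if (pthread_create(&b, NULL, worker, (void *) (intptr_t) 5) != 0) {
        pthread_join(a, NULL);
        return 0;
    }
    pthread_join(a, &ra);
    pthread_join(b, &rb);
    return (intptr_t) ra == 3 && (intptr_t) rb == 5;
}
```

Calling tsd_demo() from main() (build with -pthread) should return 1: the
two threads go through the same key but never see each other's counter.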

> Basically, you have to look at every global variable in the Postgres
> backend, and determine whether to share it among all threads or to
> make it thread-specific.

Yes, if one were to implement threads in PostgreSQL, I would think that
some rewriting of several areas would be in order. Like I said before,
it would give a person a chance to restructure things so future TODO
items wouldn't be so hard to implement. Personally, I like to stay away
from global variables as much as possible. They just get you into trouble.

> > That last line is a troll if I ever saw one ;-) I will agree that
> > threads aren't for everything and that they have costs just like
> > everything else. Let me stress that last part - like everything else.
> > Certain costs exist in the present model, nothing is - how should we
> > say ... perfect.
>
> When writing in C, threading inevitably loses robustness. Erratic
> behaviour by one thread, perhaps in a user defined function, can
> subtly corrupt the entire system, rather than just that thread. Part
> of defensive programming is building barriers between different parts
> of a system. Process boundaries are a powerful barrier.

I agree with everything you wrote above except for the first line. My
only comment is that process boundaries are only *truly* a powerful
barrier if the processes are different pieces of code and are not
dependent on each other in crippling ways. Forking the same code with the
bug in it - and only 1 in 5 die - is still 4 copies of buggy code running
on your system ;-)
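
For reference, the barrier being discussed can be sketched in a few lines
of C. This is only an illustration under my own assumptions: the names
parent_state and crash_in_child are hypothetical, and abort() stands in
for "erratic behaviour" in one forked copy.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Ordinary (non-shared) global: each process gets its own copy after fork(). */
int parent_state = 42;

/*
 * Fork a child that scribbles on the global and then dies abnormally.
 * Returns the signal number that killed the child, or -1 on error or
 * if the child did not die from a signal.
 */
int crash_in_child(void)
{
    int status = 0;
    pid_t reaped;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        parent_state = 0;   /* changes only the child's copy-on-write copy */
        abort();            /* simulated bug: child dies with SIGABRT */
    }

    /* Parent: the crash is reported to us, it does not take us down. */
    reaped = waitpid(pid, &status, 0);
    if (reaped != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

After crash_in_child() returns SIGABRT, parent_state in the parent is
still 42. The same bug is of course still present in every other copy of
the code, which is the point made above, and none of this protection
applies to memory that the processes deliberately share.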

> (Actually, though, Postgres is already vulnerable to erratic behaviour
> because any backend process can corrupt the shared buffer pool.)

I appreciate your totally honest view of the situation.

--
//========================================================\\
|| D. Hageman <dhageman(at)dracken(dot)com> ||
\\========================================================//
