Re: Shared memory

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Thomas Hallgren <thomas(at)tada(dot)se>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, PL/Java Development <Pljava-dev(at)gborg(dot)postgresql(dot)org>
Subject: Re: Shared memory
Date: 2006-03-27 10:45:06
Message-ID: 20060327104506.GB30791@svana.org
Lists: pgsql-hackers pljava-dev

On Mon, Mar 27, 2006 at 10:57:21AM +0200, Thomas Hallgren wrote:
> Martijn,
>
> I tried a Socket approach. Using the new IO stuff that arrived with Java
> 1.4 (SocketChannel etc.), the performance is really good. Especially on
> Linux where an SMP machine show a 1 to 1.5 ratio between one process doing
> ping-pong between two threads and two processes doing ping-pong using a
> socket. That's acceptable overhead indeed and I don't think I'll be able to
> trim it much using a shared memory approach (the thread scenario uses Java
> monitor locks. That's the most efficient lightweight locking implementation
> I've come across).

Yeah, it's fairly well known that the distinction between processes
and threads on Linux is much smaller than on other OSes. On Windows,
processes are comparatively expensive, which is why threading is much
more popular there.
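
For reference, the thread half of the ping-pong comparison quoted above
can be sketched roughly as below. This is not Thomas's benchmark code:
the class and method names are made up for illustration, it uses lambda
syntax from later Java versions, and it only shows two threads handing a
turn back and forth under a single Java monitor.

```java
// Hypothetical sketch: two threads pass a "turn" back and forth using a
// Java monitor (synchronized / wait / notifyAll). Names are illustrative.
final class MonitorPingPong {
    private final Object lock = new Object();
    private boolean pingTurn = true; // true: ping's move, false: pong's move
    private int roundTrips = 0;      // incremented each time pong answers

    // Runs `rounds` exchanges and returns the number of completed round trips.
    static int run(int rounds) {
        MonitorPingPong pp = new MonitorPingPong();
        Thread ponger = new Thread(() -> pp.play(false, rounds));
        ponger.start();
        pp.play(true, rounds); // the calling thread acts as "ping"
        try {
            ponger.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return pp.roundTrips;
    }

    private void play(boolean isPing, int rounds) {
        synchronized (lock) {
            for (int i = 0; i < rounds; i++) {
                // Wait (releasing the monitor) until it is this side's turn.
                while (pingTurn != isPing) {
                    try {
                        lock.wait();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException(e);
                    }
                }
                if (!isPing) {
                    roundTrips++; // pong's reply completes one round trip
                }
                pingTurn = !isPing; // hand the turn to the other side
                lock.notifyAll();
            }
        }
    }

    public static void main(String[] args) {
        int rounds = 100_000;
        long start = System.nanoTime();
        int done = run(rounds);
        long elapsed = System.nanoTime() - start;
        System.out.println(done + " round trips, "
                + (elapsed / (double) done) + " ns each");
    }
}
```

The two-process variant would replace the monitor with a socket between
two JVMs; the timings printed here depend heavily on the machine and OS.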

> The real downside is that a call from SQL to PL/Java using the current
> in-process approach is really fast. It takes about 5 micro secs on my
> 2.8GHz i386 box. The overhead of an IPC-call on that box is about 18 micro
> secs on Linux and 64 micro secs on Windows. That's an overhead of between
> 440% and 1300% due to context switching alone. Yet, for some applications,

<snip>

This might take some more measurements, but AIUI the main difference
between the in-process and inter-process approaches is that one has a
JVM per connection and the other has one shared JVM. In that case my
thoughts are as follows:

- Overhead of starting the JVM. If you can start the JVM in the
postmaster you might be able to avoid this. However, if you have to
restart the JVM for each process, that's a cost.

- JIT overhead. For frequently used classes, JIT compiling can help a
lot with speed. But if every class needs to be reinterpreted each time,
maybe that costs more than your IPC.

- Memory overhead. You mentioned this already.

- Are you optimising for many short-lived connections or a few
long-lived connections?

My gut feeling is that if someone creates a huge number of server-side
Java functions, performance will be better with one always-running JVM
holding highly JIT-optimised code than with each JVM doing it from
scratch. But this will obviously need to be tested.

One other thing is that separate processes give you the ability to
parallelize. For example, if a Java function does an SPI query, it can
receive and process results in parallel with the backend generating
them. This may not be easy to achieve with an in-process JVM.
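
To illustrate the idea only: below is a rough sketch in which one thread
stands in for the backend generating rows and the caller consumes them
as they arrive through a bounded queue. Nothing here is PL/Java or SPI
API; the class name, the sentinel value and the integer "rows" are
assumptions made for the example, and in the real case the two sides
would be separate processes connected by a socket or similar.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: a producer thread plays the role of the backend
// generating result rows while the caller processes them concurrently.
final class PipelinedResults {
    // Sentinel marking the end of the result set (assumed never a real row).
    private static final int END = Integer.MIN_VALUE;

    // Produces rows 1..rowCount on one thread and sums them on the caller's
    // thread as they arrive, rather than after the whole set is complete.
    static long sumRows(int rowCount) {
        BlockingQueue<Integer> rows = new ArrayBlockingQueue<>(64);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= rowCount; i++) {
                    rows.put(i); // blocks when the consumer falls behind
                }
                rows.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        long sum = 0;
        try {
            // take() blocks until the producer has made another row available.
            for (int row = rows.take(); row != END; row = rows.take()) {
                sum += row;
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("sum of rows: " + sumRows(1000));
    }
}
```

The bounded queue is what lets both sides make progress at once: the
producer only stalls when it gets 64 rows ahead of the consumer.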

Incidentally, there are compilers these days that can compile Java to
native code. Is this Java stuff set up in such a way that you can compile
your classes to native code and load them directly, for the real
speed-freaks? In that case, maybe you should concentrate on reliability
and flexibility and still have a way out for functions that *must* be
high-performance.

Hope this helps,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
