Re: Shared memory

From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: Thomas Hallgren <thomas(at)tada(dot)se>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, PL/Java Development <Pljava-dev(at)gborg(dot)postgresql(dot)org>
Subject: Re: Shared memory
Date: 2006-03-28 18:25:36
Message-ID: 2606EB97-424C-4C15-A3F6-AED510E5B37F@fastcrypt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pljava-dev


On 28-Mar-06, at 10:48 AM, Thomas Hallgren wrote:

> Hi Simon,
> Thanks for your input. All good points. I actually did some work
> using Java stored procedures on DB2 a while back but I had managed
> to forget (or repress :-) ) all about the FENCED/NOT FENCED stuff.
> The current discussion definitely puts it in a different
> perspective. I think PL/Java has a pretty good 'NOT FENCED'
> implementation, as does many other PL's, but no PL has yet come up
> with a FENCED solution.

What exactly is a FENCED solution ? If it is simply a remote
connection to a single JVM then pl-j already does that.
>
> This FENCED/NOT FENCED terminology would be a good way to
> differentiate between the two approaches. Any chance of that syntax
> making it into the PostgreSQL grammar, should the need arise?
>
> Some more comments inline:
>
> Simon Riggs wrote:
>> Just some thoughts from afar: DB2 supports in-process and out-of-
>> process
>> external function calls (UDFs) that it refers to as UNFENCED and
>> FENCED
>> procedures. For Java only, IBM have moved to supporting *only* FENCED
>> procedures for Java functions, i.e. having a single JVM for all
>> connections.
> >
> Are you sure about this? As I recall it a FENCED stored procedure
> executed in a remote JVM of it's own. A parameter could be used
> that either caused a new JVM to be instantiated for each stored
> procedure call or to be kept for the duration of the session. The
> former would yield really horrible performance but keep memory
> utilization at a minimum. The latter would get a more acceptable
> performance but waste more memory (in par with PL/Java today).
>
>
>> Each connection's Java function runs as a thread on a
>> single dedicated JVM-only process.
> If that was true, then different threads could share dirty session
> data. I wanted to do that using DB2 but found it impossible. That
> was a while back though.
>
>> That approach definitely does increase the invocation time, but it
>> significantly reduces the resources associated with the JVM, as
>> well as
>> allowing memory management to be more controllable (bliss...). So the
>> overall picture could be more CPU and memory resources for each
>> connection in the connection pool.
> My very crude measurements indicate that the overhead of using a
> separate JVM is between 6-15MB of real memory per connection.
> Today, you get about 10MB/$ and servers configured with 4GB RAM or
> more are not uncommon.
>
> I'm not saying that the overhead doesn't matter. Of course it does.
> But the time when you needed to be extremely conservative with
> memory usage has passed. It might be far less expensive to buy some
> extra memory then to invest in SMP architectures to minimize IPC
> overhead.
>
> My point is, even fairly large app-servers (using connection pools
> with up to 200 simultaneous connections) can run using relatively
> inexpensive boxes such as an AMD64 based server with 4GB RAM and
> show very good throughput with the current implementation.
>
>
>> If you have a few small Java functions centralisation would not be
>> good,
>> but if you have a whole application architecture with many
>> connections
>> executing reasonable chunks of code then this can be a win.
> One thing to remembered is that a 'chunk of code' that executes in
> a remote JVM and uses JDBC will be hit by the IPC overhead on each
> interaction over the JDBC connection. I.e. the overhead is not just
> limited to the actual call of the UDF, it's also imposed on all
> database accesses that the UDF makes in turn.
>
>
>> In that environment we used Java for major database functions,
>> with SQL
>> functions for small extensions.
> My guess is that those major database functions did a fair amount
> of JDBC. Am I right?
>
>
>> Also the Java invocation time we should be celebrating is that by
>> having
>> Java in the database the Java<->DB time is much less than it would
>> be if
>> we had a Java stack sitting on another server.
>
> I think the cases when you have a Tomcat or JBoss sitting on the
> same physical server as the actual database are very common. One
> major reason being that you don't want network overhead between the
> middle tier and the backend. Moving logic into the database instead
> of keeping it in the middle tier is often done to get rid of the
> last hurdle, the overhead of IPC.
>
>
> Regards,
> Thomas Hallgren
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Cramer 2006-03-28 18:27:02 Re: Shared memory
Previous Message Simon Riggs 2006-03-28 18:11:53 Re: [GENERAL] PANIC: heap_update_redo: no block

Browse pljava-dev by date

  From Date Subject
Next Message Dave Cramer 2006-03-28 18:27:02 Re: Shared memory
Previous Message Simon Riggs 2006-03-28 17:56:22 Re: Shared memory