Re: Pl/Java - next step?

From: "Thomas Hallgren" <thhal(at)mailblocks(dot)com>
To: pg(at)fastcrypt(dot)com, "HORNYAK Laszlo" <hornyakl(at)inf(dot)elte(dot)hu>, "Rob Butler" <robert(dot)butler5(at)verizon(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Pl/Java - next step?
Date: 2004-02-24 07:57:10
Message-ID: 004e01c3faab$ceb07bd0$ed3016ac@Laoner
Lists: pgsql-hackers

> On the re-use front it would be VERY nice if you could
> somehow have a single patch for PostgreSQL's C code that called a set of
> Java interfaces. Then each of your implementations could implement that set
> of Java interfaces (one using JNI, the other using RMI). This would allow
> the user to swap between either implementation, but would also reduce the
> amount of similar C code in Postgres. Something I think the PostgreSQL
> hackers would much prefer.
>
> Later
> Rob
>
I understand your concern. I'm all for code reuse and all the advantages that
it will bring. In my experience, however, the design patterns used for
solutions that involve RPC differ a great deal from the ones used when you
have in-process calls. The driving forces are quite different. Let me give
you a concrete example.

Let's assume that we implement a trigger function, triggered before update
and on each row.

Using RPC, you'd like to minimize the number of calls that are made between
the two processes. Ideally, you'd like to have one call only. This can be
achieved by packing all the information you have into one structure (the old
row, the new row, parameters etc.) and passing that data, by value, to the
remote process. In the remote process, all options are now open. You can read
parameters, the old row, the new row, etc. Typically, some change would be
made to the new row and it would be sent back to the caller; again, the data
is passed by value and streamed.
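
A rough Java sketch of the by-value pattern described above. All names here (TriggerCallPayload, RpcTriggerSketch, roundTrip, remoteTrigger) are hypothetical and not taken from any Pl/Java implementation, and Java serialization merely stands in for whatever transport (RMI or otherwise) would really be used. The point is only that one request and one response carry everything.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical payload: everything the remote trigger might need, packed once.
final class TriggerCallPayload implements Serializable {
    private static final long serialVersionUID = 1L;
    final Object[] oldRow;
    final Object[] newRow;
    final String[] args;

    TriggerCallPayload(Object[] oldRow, Object[] newRow, String[] args) {
        this.oldRow = oldRow;
        this.newRow = newRow;
        this.args = args;
    }
}

final class RpcTriggerSketch {
    // Simulates the wire: serialize to bytes and read back, so the
    // receiver always works on its own copy (pass by value).
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buf.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("simulated transport failed", e);
        }
    }

    // Runs in the "remote" process: free to read anything in the payload.
    // As an arbitrary example change, it upper-cases column 1 of the new row.
    static Object[] remoteTrigger(TriggerCallPayload p) {
        p.newRow[1] = ((String) p.newRow[1]).toUpperCase();
        return p.newRow;
    }

    // Caller side: exactly one request and one response cross the boundary.
    static Object[] callTrigger(Object[] oldRow, Object[] newRow, String[] args) {
        TriggerCallPayload received = roundTrip(new TriggerCallPayload(oldRow, newRow, args));
        return roundTrip(remoteTrigger(received));
    }
}
```

Because the remote side only ever sees copies, the caller's own rows are untouched until it applies the returned row itself.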

Using in-process calls, you'd like to minimize resource consumption. Thus,
you want to minimize copying of data and you want to make data available on
demand. So, a JNI solution would typically wrap the TriggerData in a Java
object with accessor methods that enable the Java developer to obtain the
old row, the new row, and the parameters. An old row would be a wrapper of
the actual HeapTuple contained in the TriggerData, etc. No copies anywhere
and no streaming, but a radically increased number of calls between the Java
and the C domain compared to the RPC solution.
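
A rough Java sketch of the wrapper pattern described above. The names (FakeBackend, TupleHandle, TriggerDataWrapper) are hypothetical. In a real JNI binding the accessors would be native methods reading the HeapTuple in place; a plain Java stand-in with a call counter is used here so the sketch runs without native code.

```java
// Stand-in for the C side. Assumption: each getAttribute call represents
// one crossing of the Java/C boundary in a real JNI implementation.
final class FakeBackend {
    private final Object[][] tuples;
    int crossings = 0;

    FakeBackend(Object[]... tuples) {
        this.tuples = tuples;
    }

    Object getAttribute(int tupleIndex, int column) {
        crossings++;
        return tuples[tupleIndex][column];
    }
}

// Thin wrapper around one tuple: holds only a handle, copies no row data.
final class TupleHandle {
    private final FakeBackend backend;
    private final int tupleIndex;

    TupleHandle(FakeBackend backend, int tupleIndex) {
        this.backend = backend;
        this.tupleIndex = tupleIndex;
    }

    // Values are fetched on demand, one boundary crossing per access.
    Object get(int column) {
        return backend.getAttribute(tupleIndex, column);
    }
}

// Wrapper around the trigger context; rows are handed out as handles.
final class TriggerDataWrapper {
    private final FakeBackend backend;

    TriggerDataWrapper(FakeBackend backend) {
        this.backend = backend;
    }

    TupleHandle getOldRow() {
        return new TupleHandle(backend, 0);
    }

    TupleHandle getNewRow() {
        return new TupleHandle(backend, 1);
    }
}
```

Creating the wrapper touches no row data at all. Each column read is a separate call into the backend, which is cheap in-process but would be costly if every such call were an RPC.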

Now, consider that the one and only motivation for the JNI approach is to
have extremely fast integration between C and Java. This is accomplished at
the cost of resource consumption caused by multiple JVMs. Also take into
account that the major drawback with the RPC approach is the high number of
RPC calls that will be the result of some scenarios. It becomes very clear
(at least to me) that in order to get the best out of each solution, it's
essential that we use different design patterns. Otherwise, we get a
situation where optimizing the former means degrading the latter.

IMO, we can make the solutions identical from a Pl/Java user's
perspective, and in all the Java code used to administer each
solution, but not in the C code.

Regards,

Thomas Hallgren
