Re: libpq object hooks

From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, "Andrew Chernow" <ac(at)esilo(dot)com>
Subject: Re: libpq object hooks
Date: 2008-05-16 21:38:02
Message-ID: b42b73150805161438y2c6d4373sb036d0ed649ddf88@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

On Fri, May 16, 2008 at 4:23 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Merlin Moncure" <mmoncure(at)gmail(dot)com> writes:
>> Right. I actually overlooked the 'passthrough' in
>> PQregisterEventProc. It turns out that we are still not quite on the
>> same page and this needs to be clarified before we move on. The
>> passthrough cannot exist...
>
> Yes, it *will* exist. You are assuming that the goals you have for
> these hooks are the only ones anyone will ever have. There are other
> possible usage patterns and many of them will need some passthrough
> state data. I'd venture to say that anytime I've ever used a callback
> design that did not include a passthrough, I've had reason to curse
> its designer sooner or later (qsort being a pretty typical example).

right. It can exist, just not hold the event data (the extended
properties of PGresult/conn). It has a meaning and scope that are not
defined for libpq purposes. For the 'result' callbacks (see below),
we will just use the passthrough passed in for the connection.
libpqtypes will not use this, and we were getting nervous about people
understanding what the different pointers were used for (we would call
it a user pointer, not a pass through).

> However, it's clear that we are not on the same page, because what
> I thought you wanted the PQeventData function to return was the
> passthrough pointer. Evidently that's not what you want it to do,
> so you'd better back up a few steps and say what you do want it to do.
> (I think that a function to get the passthrough value associated with
> a given hook function might be useful too, but it's clear that what

We will do this:
void *PQeventData(PGconn*, PGeventProc, void **passthrough);
void *PQresultEventData(PGresult*, PGeventProc, void **passthrough);

> you are after must be something else.)

The major challenge is that libpqtypes (and, presumably, other
libraries) must essentially store its own data in both conn and
result. We allocate our data when these structures are created and
free it when they are destroyed. The idea is that libpq fires
callbacks at appropriate times when conn/results are
constructed/destructed. From these events we set up our private
'event' data and delete it. Each connection and result maintains a
'list' of private event states, one for each registering callback
(libpqtypes only requires one). So, registering an event proc is
analogous to adding one or fields to a conn or a result with a private
meaning.

So, libpq is defining 6 events:
*) when a connection is created (put into OK status)
*) when a connection is reset
*) when a connection is destroyed
*) when a result is created in non-error state
*) when a result is copied (we are adding this to libpq w/PQcopyResult)
*) when a result is destroyed

In these events we set up or tear down the private state for the libpq
objects. We need to sometimes look up the data from outside the
library in public (non callback) code. This is what PQeventData, etc.
accomplish...provide the private state for the object. The
difference is that there can be more than one registered event proc on
a conn/result at a time; thus the need for an additional "lookup"
value when getting event data (what was hookName and is now the event
proc address).

>> The purpose of the callbacks is to allow a hooking library to
>> establish private data that is associated with a PGconn or a PGresult.
>
> So you're imagining that on creation of a PGconn or PGresult, the
> hook function would be allowed to create some storage and libpq would
> then be responsible for tracking that storage? Okay, but that's a
> separate feature from the sort of passthrough I had in mind. Where
> exactly does the hook hand off the storage pointer to libpq? What
> are you going to do if the hook fails to create the storage
> (ie, out of memory during PGresult creation)?

Right, this could happen for one of two likely reasons: OOM as you
suggest or the callback fails for some reason only known to the
hooking library. In either case, the callback would set the return
pointer as defined by the structure to null, return FALSE, and libpq
would not add the state to the array of 'event state datas' it
maintains.

Here is the event proc with passthrough added:
int PGEventProc(PGEventId event, void *eventInfo, void *passThrough);
// FALSE = failure

>> Invoking PQregisterEventProc tells libpq 'I am interested in doing
>> this', for the supplied PGconn, future results created with that
>> connection, and (if PGconn is passed as null), all future connections.
>
> I am entirely serious about saying that I will not accept that last bit.
> To do that we have to buy into having permanent modifiable storage
> within libpq, protecting it with thread mutexes, etc etc. And I don't
> like any of the possible usage scenarios for it. I think hooking into a
> PGconn immediately after its creation is just as useful and a lot easier
> to manage.

Locking is not required so long as you follow certain conventions.
Here is our warning that describes this in the header:
+ * WARNING: This should only be called in the application's main thread
+ * before using any other libpq functions. This is not thread-safe and
+ * should be used prior to creating any application threads.

Obviously, the 'global' behavior is optional, but nice (it's kind of a
'gotcha' if you forget to re-register in a reconnect routine for
example). It is like an _init() routine, and considered doing it that
way. It does mean adding an array of 'global' event procs to libpq.
So I would ask you to hold your judgment on this, and, once we get the
new patch in, consider it and we will pull it if you still think it's
too dangerous (meaning, the convention is not followed, by a api
user).

>> Once that is established, libpq begins telling the hooking library
>> when and what needs to be allocated or deleted.
>
> Wait a minute --- you're trying to get between libpq and malloc?
> Why? That's getting a *whole* lot more invasive than I thought
> this patch intended to be, and I don't see a good reason for it.

No, we don't get in between. Maybe my wording was a little
misleading...libpq notifies the registered event proc implementations
of when a conn or result is created/destroyed. libpq, strictly
speaking, is not allocating the private states...just notifying other
code it's time to do it. There is no change to the way libpq
allocates things generally.

When we send up the revised patch, you will see the invasion is
minimal...the trickiest part was actually injecting the events in the
proper places.

merlin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2008-05-16 22:22:42 Re: missing $PostgreSQL:$
Previous Message Josh Berkus 2008-05-16 21:25:50 Re: [GSoC08]some detail plan of improving hash index

Browse pgsql-patches by date

  From Date Subject
Next Message Tom Lane 2008-05-16 23:41:15 Re: [HACKERS] TRUNCATE TABLE with IDENTITY
Previous Message Tom Lane 2008-05-16 20:23:16 Re: libpq object hooks