Re: GUC assign hooks (was Re: wal_buffers = -1 and SIGHUP)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bernd Helmle <mailings(at)oopsware(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: GUC assign hooks (was Re: wal_buffers = -1 and SIGHUP)
Date: 2011-04-04 18:52:11
Message-ID: BANLkTinT-0E_QuZYGcTuEq5c4bVVxZtS_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 4, 2011 at 2:41 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I wrote:
>> IMO the real problem is essentially that GUC assign hooks have two
>> functions, checking and canonicalization of the value-to-be-stored
>> versus executing secondary actions when an assignment is made; and
>> there's no way to get at just the first one.  So we cannot canonicalize
>> the value first and then see if it's equal to the current setting.
>> I think the only real fix is to change the API for assign hooks.  This
>> is a bit annoying but it's not like we haven't ever done that before.
>> I'm thinking about splitting assign hooks into two functions, along the
>> lines of
>
>>       bool check_hook (datatype *new_value, GucSource source)
>>       bool assign_hook (datatype new_value, GucSource source)
>
> After perusing the existing assign_hook functions, I have some further
> thoughts about this proposal.
>
> * Many of the existing assign hooks do a nontrivial amount of
> computation (such as catalog lookups) to derive internal values from the
> presented string; examples include assign_temp_tablespaces and
> assign_timezone.  A naive implementation of the above design would
> require the assign_hook to repeat this computation after the check_hook
> already did it, which of course is undesirable.
>
> * Assign hooks that do catalog lookups need special pushups to avoid
> having to do such lookups while restoring a previous GUC setting during
> transaction abort (since lookups can't safely be performed in an aborted
> transaction).  Up to now we've used ad-hoc techniques for each such
> variable, as seen for instance in assign_session_authorization.  The
> usual idea is to modify the original string to store additional data,
> which then requires a custom show_hook to ensure only the original part
> of the string is shown to users.  The messiest aspect of this is that
> it must be possible to reliably tell a modified string from original
> user input.
>
> I think that we can avoid the first problem and clean up the second
> problem if we do this:
>
> 1. Code guc.c so that the check_hook is only applied to original user
> input.  When restoring a previous setting during abort (which
> necessarily will have been passed through the check_hook at an earlier
> time), only call the assign_hook.
>
> 2. Guarantee that the string pointer passed to the assign_hook is
> exactly what was returned by the check_hook, ie, guc.c is not allowed
> to duplicate or copy that string.
>
> Given these rules, a check_hook and assign_hook could cooperate to store
> additional data in what guc.c thinks is just a pointer to a string
> value, ie, there can be more data after the terminating \0.  The
> assign_hook could work off just this additional data without ever doing
> a catalog lookup.  No special show_hook is needed.

The only thing this proposal has to recommend it is that the current
coding is even worse.

> Of course this only works for string GUC variables, but I'm finding it
> hard to visualize a case where a bool, int, float, or enum GUC could
> need a catalog lookup to interpret.  We could possibly legislate that
> all of these are separately malloc'd to allow the same type of trick to
> be applied across the board, but I think that's overkill.  We can just
> tell people they must use a string GUC if they need hidden data.
>
> This technique would need documentation of course, but at least it
> *would* be documented rather than ad-hoc for each variable.
>
> Another variant would be to allow the check_hook to pass back a separate
> "void *" value that could be passed on to the assign_hook, containing
> any necessary derived data.  This is logically a bit cleaner, and would
> work for all types of GUC variables; but it would make things messier in
> guc.c since there would be an additional value to pass around.  I'm not
> convinced it's worth that, but could be talked into it if anyone feels
> strongly about it.
>
> Another thing I was reminded of while perusing the code is the comment
> for GUC_complaint_elevel:
>
>  * At some point it'd be nice to replace this with a mechanism that allows
>  * the custom message to become the DETAIL line of guc.c's generic message.
>
> The reason we invented GUC_complaint_elevel in the first place was to
> avoid a change in the signatures of assign hooks.  If we are making such
> a change, now would be the time to fix it, because we're never gonna fix
> it otherwise.  I see a few ways we could do it:
>
> 1. Add a "char **errdetail" parameter to assign_hooks, which guc.c would
> initialize to NULL before call.  If the hook stores a non-null pointer
> there, guc.c would include that string as errdetail.  This is the least
> effort from guc.c's viewpoint, but will be relatively painful to use
> from the hook's standpoint, because it'd generally have to palloc some
> space, or maybe even set up a StringInfo buffer to contain the generated
> message.
>
> 2. Add a "char *errdetail" parameter to assign_hooks, which points at
> a local-variable buffer in the calling routine, of a well-known size
> (think GUC_ERRDETAIL_BUFSIZE macro in guc.h).  Then hooks do something
> like
>        snprintf(errdetail, GUC_ERRDETAIL_BUFSIZE, _("format"), ...);
> to return an error detail string.
>
> 3. Create a function in guc.c for hooks to call, along the lines of
>        void GUC_assign_errdetail(const char *format, ...)
> The simplest implementation of this would rely on the assumption that
> assign-hook-calling isn't re-entrant, so it could just store the
> formatted string in a static variable.  That seems a bit ugly, but at
> least the ugliness would be hidden in a well-defined place, where it
> could be fixed locally if the assumption ever breaks down.
>
> At this point I'm leaning to #3 but wondered if anyone would see it
> as either overkill or too ugly.  With any of these variants, we could
> forget about my previous suggestion of adding an explicit elevel
> parameter to assign hook calls, since the hooks would no longer need
> to know the elog level to use anyway.

Definitely +1 for #3.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-04-04 18:56:25 Re: time table for beta1
Previous Message Kevin Grittner 2011-04-04 18:50:37 Re: time table for beta1