Re: GUC assign hooks (was Re: wal_buffers = -1 and SIGHUP)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Bernd Helmle <mailings(at)oopsware(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: GUC assign hooks (was Re: wal_buffers = -1 and SIGHUP)
Date: 2011-04-04 18:41:08
Message-ID: 3158.1301942468@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> IMO the real problem is essentially that GUC assign hooks have two
> functions, checking and canonicalization of the value-to-be-stored
> versus executing secondary actions when an assignment is made; and
> there's no way to get at just the first one. So we cannot canonicalize
> the value first and then see if it's equal to the current setting.
> I think the only real fix is to change the API for assign hooks. This
> is a bit annoying but it's not like we haven't ever done that before.
> I'm thinking about splitting assign hooks into two functions, along the
> lines of

> bool check_hook (datatype *new_value, GucSource source)
> bool assign_hook (datatype new_value, GucSource source)

After perusing the existing assign_hook functions, I have some further
thoughts about this proposal.

* Many of the existing assign hooks do a nontrivial amount of
computation (such as catalog lookups) to derive internal values from the
presented string; examples include assign_temp_tablespaces and
assign_timezone. A naive implementation of the above design would
require the assign_hook to repeat this computation after the check_hook
already did it, which of course is undesirable.

* Assign hooks that do catalog lookups need special pushups to avoid
having to do such lookups while restoring a previous GUC setting during
transaction abort (since lookups can't safely be performed in an aborted
transaction). Up to now we've used ad-hoc techniques for each such
variable, as seen for instance in assign_session_authorization. The
usual idea is to modify the original string to store additional data,
which then requires a custom show_hook to ensure only the original part
of the string is shown to users. The messiest aspect of this is that
it must be possible to reliably tell a modified string from original
user input.

I think that we can avoid the first problem and clean up the second
problem if we do this:

1. Code guc.c so that the check_hook is only applied to original user
input. When restoring a previous setting during abort (which
necessarily will have been passed through the check_hook at an earlier
time), only call the assign_hook.

2. Guarantee that the string pointer passed to the assign_hook is
exactly what was returned by the check_hook, ie, guc.c is not allowed
to duplicate or copy that string.

Given these rules, a check_hook and assign_hook could cooperate to store
additional data in what guc.c thinks is just a pointer to a string
value, ie, there can be more data after the terminating \0. The
assign_hook could work off just this additional data without ever doing
a catalog lookup. No special show_hook is needed.

Of course this only works for string GUC variables, but I'm finding it
hard to visualize a case where a bool, int, float, or enum GUC could
need a catalog lookup to interpret. We could possibly legislate that
all of these are separately malloc'd to allow the same type of trick to
be applied across the board, but I think that's overkill. We can just
tell people they must use a string GUC if they need hidden data.

This technique would need documentation of course, but at least it
*would* be documented rather than ad-hoc for each variable.

Another variant would be to allow the check_hook to pass back a separate
"void *" value that could be passed on to the assign_hook, containing
any necessary derived data. This is logically a bit cleaner, and would
work for all types of GUC variables; but it would make things messier in
guc.c since there would be an additional value to pass around. I'm not
convinced it's worth that, but could be talked into it if anyone feels
strongly about it.

Another thing I was reminded of while perusing the code is the comment
for GUC_complaint_elevel:

* At some point it'd be nice to replace this with a mechanism that allows
* the custom message to become the DETAIL line of guc.c's generic message.

The reason we invented GUC_complaint_elevel in the first place was to
avoid a change in the signatures of assign hooks. If we are making such
a change, now would be the time to fix it, because we're never gonna fix
it otherwise. I see a few ways we could do it:

1. Add a "char **errdetail" parameter to assign_hooks, which guc.c would
initialize to NULL before call. If the hook stores a non-null pointer
there, guc.c would include that string as errdetail. This is the least
effort from guc.c's viewpoint, but will be relatively painful to use
from the hook's standpoint, because it'd generally have to palloc some
space, or maybe even set up a StringInfo buffer to contain the generated
message.

2. Add a "char *errdetail" parameter to assign_hooks, which points at
a local-variable buffer in the calling routine, of a well-known size
(think GUC_ERRDETAIL_BUFSIZE macro in guc.h). Then hooks do something
like
snprintf(errdetail, GUC_ERRDETAIL_BUFSIZE, _("format"), ...);
to return an error detail string.

3. Create a function in guc.c for hooks to call, along the lines of
void GUC_assign_errdetail(const char *format, ...)
The simplest implementation of this would rely on the assumption that
assign-hook-calling isn't re-entrant, so it could just store the
formatted string in a static variable. That seems a bit ugly, but at
least the ugliness would be hidden in a well-defined place, where it
could be fixed locally if the assumption ever breaks down.

At this point I'm leaning to #3 but wondered if anyone would see it
as either overkill or too ugly. With any of these variants, we could
forget about my previous suggestion of adding an explicit elevel
parameter to assign hook calls, since the hooks would no longer need
to know the elog level to use anyway.

Comments?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-04-04 18:42:07 Re: [HACKERS] Uppercase SGML entity declarations
Previous Message Kevin Grittner 2011-04-04 18:38:59 Disable optimization when in subtransaction