Re: log bind parameter values on error

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Alexey Bashtanov <bashtanov(at)imap(dot)cc>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: log bind parameter values on error
Date: 2019-09-20 17:46:27
Message-ID: 20190920174627.qmaditduxlrrzb5l@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2019-09-18 17:58:53 -0300, Alvaro Herrera wrote:
> --- a/doc/src/sgml/config.sgml
> +++ b/doc/src/sgml/config.sgml
> @@ -6414,6 +6414,23 @@ log_line_prefix = '%m [%p] %q%u(at)%d/%a '
> </listitem>
> </varlistentry>
>
> + <varlistentry id="guc-log-parameters-on-error" xreflabel="log_parameters_on_error">
> + <term><varname>log_parameters_on_error</varname> (<type>boolean</type>)
> + <indexterm>
> + <primary><varname>log_parameters_on_error</varname> configuration parameter</primary>
> + </indexterm>
> + </term>
> + <listitem>
> + <para>
> + Controls whether the statement is logged with bind parameter values.

Trailing whitespace.

I find the "the statement" formulation a bit odd, but can't quite put my
finger on why.

I think it might be worthwhile to cross-reference
log_min_error_statement, as log_parameters_on_error will only take effect when the
statement is logged due to log_min_error_statement.

> + It adds some overhead, as postgres will cache textual
> + representations of parameter values in memory for all statements,
> + even if they eventually do not get logged. The default is
> + <literal>off</literal>. Only superusers can change this setting.

I don't think "cache" is the right descriptor. Usually caching implies
trying to make repeated tasks faster, which isn't the case here.

> +/*
> + * The top-level portal that the client is immediately working with:
> + * creating, binding, executing, or all at one using simple protocol
> + */
> +Portal current_top_portal = NULL;
> +

This strikes me as decidedly not nice. For one this means that we'll not
be able to use this infrastructure for binds that are done serverside,
e.g. as part of plpgsql. I'm basically inclined to think that
integrating this at the postges.c level is the wrong thing.

It also adds new error handling complexity, which is already quite
substantial. And spreads knowledge of portals to elog.c, which imo
shouldn't have to know about them.

It seems to me that this would really need to be tracked inside the
portal infrastructure. To avoid unnecessary overhead, we could continue
to set the text values in exec_bin_message() in the pformat == 0 case,
using hasTextValues somewhere in the portal code to determine whether
the text representation has to be computed (for other input format, and
internal queries as e.g. generated by plpgsql).

And then the PG_CATCHes in pquery.c can add the errdetail() in the error
cases using an econtext callback.

> +/*
> + * get_portal_bind_params
> + * Get a string containing parameters data -- used for logging.
> + *
> + * Can return NULL if there are no parameters in the portal or the portal is
> + * not valid, or the text representations of the parameters are not available.
> + * If returning a non-NULL value, it allocates memory for the returned string
> + * in the current context, and it's the caller's responsibility to pfree() it.
> + */
> +char *
> +get_portal_bind_params(ParamListInfo params)
> +{
> + StringInfoData param_str;
> +
> + /* No parameters to format */
> + if (!params || params->numParams == 0)
> + return NULL;
> +
> + /*
> + * We either need textual representation of parameters pre-calculated, or
> + * call potentially user-defined I/O functions to convert the internal
> + * representation into text, which cannot be done in an aborted xact
> + */
> + if (!params->hasTextValues && IsAbortedTransactionBlockState())
> + return NULL;

Maybe I'm naive, but what is the point of keeping the separate
parameters, allocated separately, when all we're doing is building a
string containing them all at error time? Seems better just directly
form the full string when decideding to keep the text parameters - for
one it'll often end up being more efficient. But more importantly it
also makes it a lot less likely to run out of memory while handling the
error. The individual text parameters can be large, and this will always
additionally need at least the combined size of all parameters from
within the error context. That's not great.

> + appendStringInfoCharMacro(&param_str, '\'');
> + for (p = pstring; *p; p++)
> + {
> + if (*p == '\'') /* double single quotes */
> + appendStringInfoCharMacro(&param_str, *p);
> + appendStringInfoCharMacro(&param_str, *p);
> + }
> + appendStringInfoCharMacro(&param_str, '\'');

I know this is old code, but: This is really inefficient. Will cause a
lot of unnecessary memory-reallocations for large text outputs, because
we don't immediately grow to at least close to the required size.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2019-09-20 17:57:04 Re: Add "password_protocol" connection parameter to libpq
Previous Message Andres Freund 2019-09-20 17:08:31 Re: subscriptionCheck failures on nightjar