Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

From: Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Date: 2018-08-09 15:17:22
Message-ID: fcc2512cdc9e6bc49d3b489181f454da@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 09-08-2018 12:28, Fabien COELHO wrote:
> Hello Marina,

Hello!

>> v10-0002-Pgbench-errors-use-a-separate-function-to-report.patch
>> - a patch for a separate error reporting function (this is used to
>> report client failures that do not cause an aborts and this depends on
>> the level of debugging).
>
> Patch applies cleanly, compiles, global & local make check ok.

:-)

> This patch improves/homogenizes logging & error reporting in pgbench,
> in preparation for another patch which will manage transaction
> restarts in some cases.
>
> However ISTM that it is not as necessary as the previous one, i.e. we
> could do without it to get the desired feature, so I see it more as a
> refactoring done "in passing", and I'm wondering whether it is really
> worth it because it adds some new complexity, so I'm not sure of the
> net benefit.

We discussed this starting with [1]:

>>>> IMO this patch is more controversial than the other ones.
>>>>
>>>> It is not really related to the aim of the patch series, which could
>>>> do without, couldn't it?
>>>
>>>> I'd suggest that it should be an independent submission, unrelated
>>>> to
>>>> the pgbench error management patch.
>>>
>>> I suppose that this is related; because of my patch there may be a
>>> lot
>>> of such code (see v7 in [1]):
>>>
>>> - fprintf(stderr,
>>> - "malformed variable \"%s\" value: \"%s\"\n",
>>> - var->name, var->svalue);
>>> + if (debug_level >= DEBUG_FAILS)
>>> + {
>>> + fprintf(stderr,
>>> + "malformed variable \"%s\" value: \"%s\"\n",
>>> + var->name, var->svalue);
>>> + }
>>>
>>> - if (debug)
>>> + if (debug_level >= DEBUG_ALL)
>>> fprintf(stderr, "client %d sending %s\n", st->id, sql);
>>
>> I'm not sure that debug messages needs to be kept after debug, if it
>> is
>> about debugging pgbench itself. That is debatable.
>
> AFAICS it is not about debugging pgbench itself, but about more
> detailed
> information that can be used to understand what exactly happened during
> its launch. In the case of errors this helps to distinguish between
> failures or errors by type (including which limit for retries was
> violated and how far it was exceeded for the serialization/deadlock
> errors).
>
>>> That's why it was suggested to make the error function which hides
>>> all
>>> these things (see [2]):
>>>
>>> There is a lot of checks like "if (debug_level >= DEBUG_FAILS)" with
>>> corresponding fprintf(stderr..) I think it's time to do it like in
>>> the
>>> main code, wrap with some function like log(level, msg).
>>
>> Yep. I did not wrote that, but I agree with an "elog" suggestion to
>> switch
>>
>> if (...) { fprintf(...); exit/abort/continue/... }
>>
>> to a simpler:
>>
>> elog(level, ...)

> Anyway, I still have quite a few comments/suggestions on this version.

Thank you very much for them!

> * ErrorLevel
>
> If ErrorLevel is used for things which are not errors, its name should
> not include "Error"? Maybe "LogLevel"?

On the one hand, this sounds better for me too. On the other hand, will
not this be in some kind of conflict with error level codes in elog.h?..

/* Error level codes */
#define DEBUG5 10 /* Debugging messages, in categories of
* decreasing detail. */
#define DEBUG4 11
...

> I'm at odds with the proposed levels. ISTM that pgbench internal
> errors which warrant an immediate exit should be dubbed "FATAL",

Ok!

> which
> would leave the "ERROR" name for... errors, eg SQL errors. I'd suggest
> to use an INFO level for the PGBENCH_DEBUG function, and to keep LOG
> for main program messages, so that all use case are separate. Or,
> maybe the distinction between LOG/INFO is unclear so info is not
> necessary.

The messages of the errors in SQL and meta commands are printed only if
the option --debug-fails is used so I'm not sure that they should have a
higher error level than main program messages (ERROR vs LOG). About an
INFO level for the PGBENCH_DEBUG function - ISTM that some main program
messages such as "dropping old tables...\n" or ..." tuples (%d%%) done
(elapsed %.2f s, remaining %.2f s)\n" can also use it.. About that all
use cases were separate - in the current version the level LOG also
includes messages about abortions of the clients.

> I'm unsure about the "log_min_messages" variable name, I'd suggest
> "log_level".
>
> I do not see the asserts on LOG >= log_min_messages as useful, because
> the level can only be LOG or DEBUG anyway.

Ok!

> This point also suggest that maybe "pgbench_error" is misnamed as well
> (ok, I know I suggested it in place of ereport, but e stands for error
> there), as it is called on errors, but is also on other things. Maybe
> "pgbench_log"? Or just simply "log" or "report", as it is really an
> local function, which does not need a prefix? That would mean that
> "pgbench_simple_error", which is indeed called on errors, could keep
> its initial name "pgbench_error", and be called on errors.

About the name "log" - we already have the function doLog, so perhaps
the name "report" will be better.. But like with ErrorLevel will not
this be in some kind of conflict with ereport which is also used for the
levels DEBUG... / LOG / INFO?

> Alternatively, the debug/logging code could be let as it is (i.e.
> direct print to stderr) and the function only called when there is
> some kind of error, in which case it could be named with "error" in
> its name (or elog/ereport...).

As I wrote in [2]:

> because of my patch there may be a lot
> of such code (see v7 in [1]):
>
> - fprintf(stderr,
> - "malformed variable \"%s\" value: \"%s\"\n",
> - var->name, var->svalue);
> + if (debug_level >= DEBUG_FAILS)
> + {
> + fprintf(stderr,
> + "malformed variable \"%s\" value: \"%s\"\n",
> + var->name, var->svalue);
> + }
>
> - if (debug)
> + if (debug_level >= DEBUG_ALL)
> fprintf(stderr, "client %d sending %s\n", st->id, sql);
>
> That's why it was suggested to make the error function which hides all
> these things (see [2]):
>
> There is a lot of checks like "if (debug_level >= DEBUG_FAILS)" with
> corresponding fprintf(stderr..) I think it's time to do it like in the
> main code, wrap with some function like log(level, msg).

And IIUC macros will not help in the absence of __VA_ARGS__.

> * PQExpBuffer
>
> I still do not see a positive value from importing PQExpBuffer
> complexity and cost into pgbench, as the resulting code is not very
> readable and it adds malloc/free cycles, so I'd try to avoid using
> PQExpBuf as much as possible. ISTM that all usages could be avoided in
> the patch, and most should be avoided even if ExpBuffer is imported
> because it is really useful somewhere.
>
> - to call pgbench_error from pgbench_simple_error, you can do a
> pgbench_log_va(level, format, va_list) version called both from
> pgbench_error & pgbench_simple_error.
>
> - for PGBENCH_DEBUG function, do separate calls per type, the very
> small partial code duplication is worth avoiding ExpBuf IMO.
>
> - for doCustom debug: I'd just let the printf as it is, with a
> comment, as it is really very internal stuff for debug. Or I'd just
> snprintf a something in a static buffer.
>
> - for syntax_error: it should terminate, so it should call
> pgbench_error(FATAL, ...). Idem, I'd either keep the printf then call
> pgbench_error(FATAL, "syntax error found\n") for a final message,
> or snprintf in a static buffer.
>
> - for listAvailableScript: I'd simply call "pgbench_error(LOG" several
> time, once per line.
>
> I see building a string with a format (printfExpBuf..) and then
> calling the pgbench_error function with just a "%s" format on the
> result as not very elegant, because the second format is somehow
> hacked around.

Ok! About using a static buffer in doCustom debug or in syntax_error -
I'm not sure that this is always possible because ISTM that the variable
name can be quite large.

> * bool client
>
> I'm unconvince by this added boolean just to switch the level on
> encountered errors.
>
> I'd suggest to let lookupCreateVariable, putVariable* as they are,
> call pgbench_error with a level which does not stop the execution, and
> abort if necessary from the callers with a "aborted because of
> putVariable/eval/... error" message, as it was done before.

There's one more problem: if this is a client failure, an error message
inside any of these functions should be printed at the level
DEBUG_FAILS; otherwise it should be printed at the level LOG. Or do you
suggest using the error level as an argument for these functions?

> pgbench_error calls pgbench_error. Hmmm, why not.

[1]
https://www.postgresql.org/message-id/alpine.DEB.2.21.1806100837380.3655%40lancre
[2]
https://www.postgresql.org/message-id/b692de21caaed13c59f31c06d0098488%40postgrespro.ru

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-08-09 15:23:04 libpq connection timeout mismanagement
Previous Message Tom Lane 2018-08-09 15:05:02 libpq should not look up all host addresses at once