Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org, Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Date: 2018-07-11 13:24:43
Message-ID: alpine.DEB.2.21.1807111435250.27883@lancre
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello Marina,

>> * -d/--debug: I'm not in favor in requiring a mandatory text argument on
>> this option.
> As you wrote in [1], adding an additional option is also a bad idea:

Hey, I'm entitled to some internal contradictions:-)

>> I'm sceptical of the "--debug-fails" options. ISTM that --debug is
>> already there and should just be reused.

I was thinking that you could just use the existing --debug, not change
its syntax. My point was that --debug exists, and you could just print
the messages when under --debug.

> Maybe it's better to use an optional argument/arguments for compatibility
> (--debug[=fails] or --debug[=NUM])? But if we use the numbers, now I can see
> only 2 levels, and there's no guarantee that they will no change..

Optional arguments to options (!) are not really clean things, so I'd like
to avoid going onto this path, esp. as I cannot see any other instance in
pgbench or elsewhere in postgres, and I personnaly consider these as a bad

So if absolutely necessary, a new option is still better than changing
--debug syntax. If not necessary, then it is better:-)

>> * I'm reserved about the whole ereport thing, see comments in other
>> messages.
> Thank you, I'll try to implement the error reporting in the way you
> suggested.

Dunno if it is a good idea either. The committer word is the good one in
the end:-à

> Thank you, I'll fix this.
> I'm sorry, I'll fix this.

You do not have to thank me or being sorry on every comment I do, once a
the former is enough, and there is no need for the later.

>> * doCustom changes.

>> On CSTATE_FAILURE, the next command is possibly started. Although there
>> is some consistency with the previous point, I think that it totally
>> breaks the state automaton where now a command can start while the
>> whole transaction is in failing state anyway. There was no point in
>> starting it in the first place.
>> So, for me, the FAILURE state should record/count the failure, then skip
>> to RETRY if a retry is decided, else proceed to ABORT. Nothing else.
>> This is much clearer that way.
>> Then RETRY should reinstate the global state and proceed to start the
>> *first* command again.
>> <...>
>> It is unclear to me why backslash command errors are turned to FAILURE
>> instead of ABORTED: there is no way they are going to be retried, so
>> maybe they should/could skip directly to ABORTED?

> So do you propose to execute the command "ROLLBACK" without calculating its
> latency etc. if we are in a failed transaction and clear the conditional
> stack after each failure?

> Also just to be clear: do you want to have the state CSTATE_ABORTED for
> client abortion and another state for interrupting the current transaction?

I do not understand what "interrupting the current transaction" means. A
transaction is either committed or rollbacked, I do not know about
"interrupted". When it is rollbacked, probably some stats will be
collected in passing, I'm fine with that.

If there is an error in a pgbench script, the transaction is aborted,
which means for me that the script execution is stopped where it was, and
either it is restarted from the beginning (retry) or counted as failure
(not retry, just aborted, really).

If by interrupted you mean that one script begins a transaction and
another ends it, as I said in the review I think that this strange case
should be forbidden, so that all the code and documentation trying to
manage that can be removed.

>> The current RETRY state does memory allocations to generate a message
>> with buffer allocation and so on. This looks like a costly and useless
>> operation. If the user required "retries", then this is normal behavior,
>> the retries are counted and will be printed out in the final report,
>> and there is no point in printing out every single one of them.
>> Maybe you want that debugging, but then coslty operations should be
>> guarded.
> I think we need these debugging messages because, for example,

Debugging message should cost only when under debug. When not under debug,
there should be no debugging message, and there should be no cost for
building and discarding such messages in the executed code path beyond
testing whether program is under debug.

> if you use the option --latency-limit, you we will never know in advance
> whether the serialization/deadlock failure will be retried or not.

ISTM that it will be shown final report. If I want debug, I ask for
--debug, otherwise I think that the command should do what it was asked
for, i.e. run scripts, collect performance statistics and show them at the

In particular, when running with retries is enabled, the user is expecting
deadlock/serialization errors, so that they are not "errors" as such for

> They also help to understand which limit of retries was violated or how
> close we were to these limits during the execution of a specific
> transaction. But I agree with you that they are costly and can be
> skipped if the failure type is never retried. Maybe it is better to
> split them into multiple error function calls?..

Debugging message costs should only be incurred when under --debug, not

>> You have added 20-columns alignment prints. This looks like too much and
>> generates much too large lines. Probably 10 (billion) would be enough.
> I have already asked you about this in [2]:


>> The variables for the numbers of failures and retries are of type int64
>> since the variable for the total number of transactions has the same
>> type. That's why such a large alignment (as I understand it now, enough
>> 20 characters). Do you prefer floating alignemnts, depending on the
>> maximum number of failures/retries for any command in any script?

An int64 counter is not likely to reach its limit anytime soon:-) If the
column display limit is ever reached, ISTM that then the text is just
misaligned, which is a minor and rare inconvenience. If very wide columns
are used, then it does not fit my terminal and the report text will always
be wrapped around, which makes it harder to read, every time.

>> The latency limit to 900 ms try is a bad idea because it takes a lot of
>> time. I did such tests before and they were removed by Tom Lane because
>> of determinism and time issues. I would comment this test out for now.
> Ok! If it doesn't bother you - can you tell more about the causes of these
> determinism issues?.. Tests for some other failures that cannot be retried
> are already added to

Some farm animals are very slow, so you cannot really assume much about
time one way or another.

>> Otherwise, maybe (simple) pgbench-side thread
>> barrier could help, but this would require more thinking.
> Tests must pass if we use --disable-thread-safety..

Sure. My wording was misleading. I just meant a synchronisation barrier
between concurrent clients, which could be managed with one thread.
Anyway, it is probably overkill for the problem at hand, so just forget.

>> I do not understand why there is so much text about in failed sql
>> transaction stuff, while we are mainly interested in serialization &
>> deadlock errors, and this only falls in some "other" category. There
>> seems to be more details about other errors that about deadlocks &
>> serializable errors.
>> The reporting should focus on what is of interest, either all errors,
>> or some detailed split of these errors.
>> <...>
>> * "errors_in_failed_tx" is some subcounter of "errors", for a special
>> case. Why it is there fails me [I finally understood, and I think it
>> should be removed, see end of review]. If we wanted to distinguish,
>> then we should distinguish homogeneously: maybe just count the
>> different error types, eg have things like "deadlock_errors",
>> "serializable_errors", "other_errors", "internal_pgbench_errors" which
>> would be orthogonal one to the other, and "errors" could be recomputed
>> from these.
> Thank you, I agree with you. Unfortunately each new error type adds a new 1
> or 2 columns of maximum width 20 to the per-statement report

The fact that some data are collected does not mean that they should all
be reported in detail. We can have detailed error count and report the sum
of this errors for instance, or have some more verbose/detailed reports
as options (eg --latencies does just that).

>> <...>
>> "If a failed transaction block does not terminate in the current script":
>> this just looks like a very bad idea, and explains my general ranting
>> above about this error condition. ISTM that the only reasonable option
>> is that a pgbench script should be inforced as a transaction, or a set of
>> transactions, but cannot be a "piece" of transaction, i.e. pgbench script
>> with "BEGIN;" but without a corresponding "COMMIT" is a user error and
>> warrants an abort, so that there is no need to manage these "in aborted
>> transaction" errors every where and report about them and document them
>> extensively.
>> This means adding a check when a script is finished or starting that
>> PQtransactionStatus(const PGconn *conn) == PQTRANS_IDLE, and abort if not
>> with a fatal error. Then we can forget about these "in tx errors" counting,
>> reporting and so on, and just have to document the restriction.
> Ok!

Good:-) ISTM that this would remove a significant amount of complexity
from the code and documentation.


In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2018-07-11 13:29:39 Re: Speeding up INSERTs and UPDATEs to partitioned tables
Previous Message Alvaro Herrera 2018-07-11 13:16:33 Re: Concurrency bug in UPDATE of partition-key