Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Date: 2018-03-29 11:53:45
Message-ID: 8ab4a743-321a-1714-86ef-19fee348451d@sigaev.ru
Lists: pgsql-hackers

The conception of the max-retry option seems strange to me. If the number of
retries reaches the max-retry limit, then we just increment the counter of failed
transactions and try again (possibly with different random numbers). At the end we
have to distinguish the number of errored transactions from the number of failed
transactions, and to find this difference the documentation suggests rerunning
pgbench with debugging on.

Maybe I didn't catch the idea, but it seems to me that max-tries should be removed.
On a transaction serialization or deadlock error, pgbench should increment the
counter of failed transactions, reset the conditional stack, variables, etc. (but
not the random generator), and then start a new transaction from the first line of
the script.
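
In rough C, mirroring the pseudocode style used later in this thread, that would
be something like the following (an illustrative sketch with made-up names, not
the patch's actual code):

    case CSTATE_ABORTED:
        if (is_serialization_or_deadlock_error(st))
        {
            st->cnt_failed++;                    /* count the failed transaction */
            conditional_stack_reset(st->cstack); /* reset \if nesting */
            resetVariables(st);                  /* reset script variables */
            /* the random generator state is deliberately left untouched */
            st->command = 0;                     /* back to the first script line */
            st->state = CSTATE_START_TX;         /* start a new transaction */
        }
        else
            st->state = CSTATE_FINISHED;         /* other errors still abort */
        break;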

Marina Polyakova wrote:
> On 26-03-2018 18:53, Fabien COELHO wrote:
>> Hello Marina,
>
> Hello!
>
>>> Many thanks to both of you! I'm working on a patch in this direction...
>>>
>>>> I think that the best approach for now is simply to reset (command
>>>> zero, random generator) and start over the whole script, without
>>>> attempting to be more intelligent. The limitations should be clearly
>>>> documented (one transaction per script), though. That would be a
>>>> significant enhancement already.
>>>
>>> I'm not sure that we can always do this, because we can get new errors until
>>> we finish the failed transaction block, and we need to destroy the conditional
>>> stack...
>>
>> Sure. I'm suggesting, to simplify things, that on failures the retry
>> would always restart from the beginning of the script by resetting
>> everything, indeed including the conditional stack, the random
>> generator state, the variable values, and so on.
>>
>> This means somehow enforcing that one script is one transaction.
>>
>> If the user does not do that, it would be their decision and the
>> result becomes unpredictable on errors (eg some sub-transactions could
>> be executed more than once).
>>
>> Then if more is needed, that could be for another patch.
>
> Here is the fifth version of the patch for pgbench (based on the commit
> 4b9094eb6e14dfdbed61278ea8e51cc846e43579) where I tried to implement these
> ideas, thanks to your comments and those of Teodor Sigaev. Since we may need to
> execute commands to complete a failed transaction block, the script is now
> always executed completely. If there is a serialization/deadlock failure which
> can be retried, the script is executed again with the same random state and
> array of variables as before its first run. Errors in meta commands, as well as
> all SQL errors, do not cause the client to abort. The first failure in the
> current script execution determines whether the script run will be retried or
> not, so only such failures (which are retried) or errors (which are not
> retried) are reported.
>
> I tried to make fixes in accordance with your previous reviews ([1], [2], [3]):
>
>> I'm unclear about the example added in the documentation. There are 71%
>> errors, but 100% of transactions are reported as processed. If there were
>> errors, then it is not a success, so the transactions were not processed?
>> To me it looks inconsistent. Also, while testing, it seems that failed
>> transactions are counted in the tps, which I think is not appropriate:
>>
>>
>> About the feature:
>>
>>  sh> PGOPTIONS='-c default_transaction_isolation=serializable' \
>>        ./pgbench -P 1 -T 3 -r -M prepared -j 2 -c 4
>>  starting vacuum...end.
>>  progress: 1.0 s, 10845.8 tps, lat 0.091 ms stddev 0.491, 10474 failed
>>  # NOT 10845.8 TPS...
>>  progress: 2.0 s, 10534.6 tps, lat 0.094 ms stddev 0.658, 10203 failed
>>  progress: 3.0 s, 10643.4 tps, lat 0.096 ms stddev 0.568, 10290 failed
>>  ...
>>  number of transactions actually processed: 32028 # NO!
>>  number of errors: 30969 (96.694 %)
>>  latency average = 2.833 ms
>>  latency stddev = 1.508 ms
>>  tps = 10666.720870 (including connections establishing) # NO
>>  tps = 10683.034369 (excluding connections establishing) # NO
>>  ...
>>
>> For me this is all wrong. I think that the tps report is about transactions
>> that succeeded, not mere attempts. I cannot say that a transaction which
>> aborted was "actually processed"... as it was not.
>
> Fixed
>
>> The order of reported elements is not logical:
>>
>>  maximum number of transaction tries: 100
>>  scaling factor: 10
>>  query mode: prepared
>>  number of clients: 4
>>  number of threads: 2
>>  duration: 3 s
>>  number of transactions actually processed: 967
>>  number of errors: 152 (15.719 %)
>>  latency average = 9.630 ms
>>  latency stddev = 13.366 ms
>>  number of transactions retried: 623 (64.426 %)
>>  number of retries: 32272
>>
>> I would suggest to group everything about error handling in one block,
>> eg something like:
>>
>>  scaling factor: 10
>>  query mode: prepared
>>  number of clients: 4
>>  number of threads: 2
>>  duration: 3 s
>>  number of transactions actually processed: 967
>>  number of errors: 152 (15.719 %)
>>  number of transactions retried: 623 (64.426 %)
>>  number of retries: 32272
>>  maximum number of transaction tries: 100
>>  latency average = 9.630 ms
>>  latency stddev = 13.366 ms
>
> Fixed
>
>> Also, the percent character should be stuck to its number: 15.719%, to make
>> the style more homogeneous (although there seem to be pre-existing
>> inhomogeneities).
>>
>> I would replace "transaction tries/retried" by "tries/retried"; everything
>> is about transactions in the report anyway.
>>
>> Without reading the documentation, the overall report semantics is unclear,
>> especially given the absurd tps results I got with my first attempt,
>> as failing transactions are counted as "processed".
>
> Fixed
>
>> About the code:
>>
>> I'm at a loss with the 7 states added to the automaton, where I would have
>> hoped that only 2 (eg RETRY & FAIL, or even fewer) would be enough.
>
> Fixed
>
>> I'm wondering whether the whole feature could be simplified by
>> considering that one script is one "transaction" (it is from the
>> report point of view at least), and that any retry is for the full
>> script only, from its beginning. That would remove the need to guess
>> at transaction begin or end, avoid scanning manually for subcommands,
>> and so on.
>>  - Would it make sense?
>>  - Would it be ok for your use case?
>
> Fixed
>
>> The proposed version of the code looks unmaintainable to me. There are
>> 3 levels of nested "switch/case" with state changes at the deepest level.
>> I cannot even see it all on my screen, which is not wide enough.
>
> Fixed
>
>> There should be a typedef for "random_state", eg something like:
>>
>>   typedef struct { unsigned short data[3]; } RandomState;
>>
>> Please keep "const" declarations, eg "commandFailed".
>>
>> I think that choosing a script should depend on the thread random state, not
>> the client random state, so that a run would generate the same pattern per
>> thread, independently of which client finishes first.
>>
>> I'm sceptical of the "--debug-fails" option. ISTM that --debug is already
>> there and should just be reused.
>
> Fixed
>
>> I agree that function naming style is already a mess, but I think that
>> new functions you add should use a common style, eg "is_compound" vs
>> "canRetry".
>
> Fixed
>
>> Translating error strings to their enum should be put in a function.
>
> Removed
>
>> I'm not sure this whole thing should be done anyway.
>
> The processing of compound commands is removed.
>
>> The "node" is started but never stopped.
>
> Fixed
>
>> For file contents, maybe the << 'EOF' here-document syntax would help
>> instead of using concatenated backslashed strings everywhere.
>
> I'm sorry, but I could not get it to work with regular expressions :(
>
>> I'd start by stating (i.e. documenting) that the features assumes that one
>> script is just *one* transaction.
>>
>> Note that pgbench somehow already assumes that one script is one
>> transaction when it reports performance anyway.
>>
>> If you want 2 transactions, then you have to put them in two scripts,
>> which looks fine with me. Different transactions are expected to be
>> independent, otherwise they should be merged into one transaction.
>
> Fixed
>
>> Under these restrictions, ISTM that a retry is something like:
>>
>>    case ABORTED:
>>       if (we want to retry) {
>>          // do necessary stats
>>          // reset the initial state (random, vars, current command)
>>          state = START_TX; // loop
>>       }
>>       else {
>>         // count as failed...
>>         state = FINISHED; // or done.
>>       }
>>       break;
> ...
>> I'm fine with having END_COMMAND skipping to START_TX if it can be done
>> easily and cleanly, esp without code duplication.
>
> I did not want to add additional if-expressions to most of the code in
> CSTATE_START_TX/CSTATE_END_TX/CSTATE_END_COMMAND, so CSTATE_FAILURE is used
> instead of CSTATE_END_COMMAND in case of failure, and CSTATE_RETRY is entered
> before CSTATE_END_TX if there was a failure during the current script
> execution.
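
If I understand correctly, the resulting flow is roughly this (a simplified
sketch reusing the state names above; canRetry() stands in for whatever the
patch actually checks, not its real code):

    case CSTATE_FAILURE:
        /* keep executing the script so that the failed transaction
         * block is still ended and rolled back */
        st->state = CSTATE_START_COMMAND; /* next command, not END_COMMAND */
        break;

    case CSTATE_RETRY:
        /* entered before CSTATE_END_TX if this run had a failure */
        if (canRetry(st))
        {
            /* restore the pre-run random state and variables,
             * then rerun the whole script */
            st->state = CSTATE_START_TX;
        }
        else
            st->state = CSTATE_END_TX; /* give up: count as an error */
        break;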
>
>> ISTM that ABORTED & FINISHED are currently exactly the same. This would give
>> ABORTED a particular use. Also, there are many points where the code may go
>> to the "aborted" state, so reusing it could help avoid duplicating stuff on
>> each abort decision.
>
> To end and roll back the failed transaction block, the script is always
> executed completely, and after a failure the following script command is
> executed.
>
> [1]
> https://www.postgresql.org/message-id/alpine.DEB.2.20.1801031720270.20034%40lancre
> [2]
> https://www.postgresql.org/message-id/alpine.DEB.2.20.1801121309300.10810%40lancre
> [3]
> https://www.postgresql.org/message-id/alpine.DEB.2.20.1801121607310.13422%40lancre
>

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/
