Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, nagata(at)sraoss(dot)co(dot)jp, coelho(at)cri(dot)ensmp(dot)fr, thomas(dot)munro(at)gmail(dot)com, m(dot)polyakova(at)postgrespro(dot)ru, pgsql-hackers(at)postgresql(dot)org, teodor(at)sigaev(dot)ru
Subject: Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors
Date: 2022-03-28 07:57:59
Message-ID: 202203280757.3tu4ovs3petm@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

On 2022-Mar-27, Tatsuo Ishii wrote:

> After:
> interval_start num_transactions sum_latency sum_latency_2 min_latency max_latency
> { failures | serialization_failures deadlock_failures } [ sum_lag sum_lag_2 min_lag max_lag [ skipped ] ] [ retried retries ]

You're showing an indentation, but looking at the HTML output there is
no such. Is the HTML processor eating leading whitespace or something
like that?

I think that the explanatory paragraph is way too long now, particularly
since it explains --failures-detailed starting in the middle. Also, the
example output doesn't include the failures-detailed mode. I suggest
that this should be broken down even more; first to explain the output
without failures-detailed, including an example, and then the output
with failures-detailed, and an example of that. Something like this,
perhaps:

Aggregated Logging
With the --aggregate-interval option, a different format is used for the log files (note that the actual log line is not folded).

interval_start num_transactions sum_latency sum_latency_2 min_latency max_latency
failures [ sum_lag sum_lag_2 min_lag max_lag [ skipped ] ] [ retried retries ]

where interval_start is the start of the interval (as a Unix epoch time stamp), num_transactions is the number of transactions within the interval, sum_latency is the sum of the transaction latencies within the interval, sum_latency_2 is the sum of squares of the transaction latencies within the interval, min_latency is the minimum latency within the interval, and max_latency is the maximum latency within the interval, failures is the number of transactions that ended with a failed SQL command within the interval.

The next fields, sum_lag, sum_lag_2, min_lag, and max_lag, are only present if the --rate option is used. They provide statistics about the time each transaction had to wait for the previous one to finish, i.e., the difference between each transaction's scheduled start time and the time it actually started. The next field, skipped, is only present if the --latency-limit option is used, too. It counts the number of transactions skipped because they would have started too late. The retried and retries fields are present only if the --max-tries option is not equal to 1. They report the number of retried transactions and the sum of all retries after serialization or deadlock errors within the interval. Each transaction is counted in the interval when it was committed.

Notice that while the plain (unaggregated) log file shows which script was used for each transaction, the aggregated log does not. Therefore if you need per-script data, you need to aggregate the data on your own.

Here is some example output:

1345828501 5601 1542744 483552416 61 2573 0
1345828503 7884 1979812 565806736 60 1479 0
1345828505 7208 1979422 567277552 59 1391 0
1345828507 7685 1980268 569784714 60 1398 0
1345828509 7073 1979779 573489941 236 1411 0

If you use option --failures-detailed, instead of the sum of all failed transactions you will get more detailed statistics for the failed transactions:

interval_start num_transactions sum_latency sum_latency_2 min_latency max_latency
serialization_failures deadlock_failures [ sum_lag sum_lag_2 min_lag max_lag [ skipped ] ] [ retried retries ]

This is similar to the above, but here the single 'failures' figure is replaced by serialization_failures which is the number of transactions that got a serialization error and were not retried after this, deadlock_failures which is the number of transactions that got a deadlock error and were not retried after this. The other fields are as above. Here is some example output:

[example with detailed failures]

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"If you have nothing to say, maybe you need just the right tool to help you
not say it." (New York Times, about Microsoft PowerPoint)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-03-28 08:02:14 Re: Allow file inclusion in pg_hba and pg_ident files
Previous Message Thomas Munro 2022-03-28 07:53:20 Re: DSA failed to allocate memory