Re: Suggestion to add --continue-client-on-abort option to pgbench

From: Rintaro Ikeda <ikedarintarof(at)oss(dot)nttdata(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, 'Yugo Nagata' <nagata(at)sraoss(dot)co(dot)jp>
Cc: "slpmcf(at)gmail(dot)com" <slpmcf(at)gmail(dot)com>, "boekewurm+postgres(at)gmail(dot)com" <boekewurm+postgres(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: Suggestion to add --continue-client-on-abort option to pgbench
Date: 2025-07-01 08:43:18
Message-ID: 55156885-7072-4a8a-826b-12ec2b13f0ef@oss.nttdata.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I've updated the previous patch based on your feedback. Below is a summary of
the changes from v4 to v5:

1. (v5-0001) Added documentation and removed some code paths in response to the
comments.

2. (v5-0001) Modified the condition to transition from CSTATE_WAIT_RESULT to
CSTATE_ERROR, instead of adding a condition in canRetryError(), which had
enabled clients continue after its transaction failed. This is because, when the
--continue-on-error option is set, clients do not retry failed transactions but
start new ones.

3. (v5-0002) Renamed the enumerator TSTATUS_OTHER_ERROR, which could be
mistakenly interpreted as being related to other SQL errors. It represents an
unknown transaction status, so it has been renamed to TSTATUS_UNKNOWN_ERROR.

On 2025/06/14 0:24, Yugo Nagata wrote:
> case PGRES_NONFATAL_ERROR:
> case PGRES_FATAL_ERROR:
> st->estatus = getSQLErrorStatus(PQresultErrorField(res,
> PG_DIAG_SQLSTATE));
> if (canRetryError(st->estatus))
> {
> if (verbose_errors)
> commandError(st, PQerrorMessage(st->con));
> goto error;
> }
> /* fall through */
>
> default:
> /* anything else is unexpected */
> pg_log_error("client %d script %d aborted in command %d query %d: %s",
> st->id, st->use_file, st->command, qrynum,
> PQerrorMessage(st->con));
> goto error;
> }
>
> When an SQL error other than a serialization or deadlock error occurs, an error message is
> output via pg_log_error in this code path. However, I think this should be reported only
> when verbose_errors is set, similar to how serialization and deadlock errors are handled when
> --continue-on-error is enabled

I think the error message logged via pg_log_error is useful when verbose_errors
is not specified, because it informs users that the client has exited. Without
it, users may not notice that something went wrong.

On 2025/06/27 19:59, Hayato Kuroda (Fujitsu) wrote:
> BTW, initially we were discussing the combination of --continue-on-error and
> --exit-on-abort. What it the conclusion?
> I feel the Nagata-san's point [1] is valid in this approach.

I agree with the conclusion. I've removed the code path that prohibited using
--continue-on-error and --exit-on-abort options together.

On 2025/06/30 15:02, Yugo Nagata wrote:
> On Fri, 27 Jun 2025 10:59:09 +0000
> "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
>>>> Retrying the failed transaction is not necessary when the transaction
>>>> failed due to SQL-level errors. Unlike real-world applications, pgbench
>>>> does not need to complete specific transaction successfully. In the case
>>>> of unique constraint violations, retrying the same transaction will
>>>> likely to result in the same error again.
>>
>> I intended here that clients could throw away the failed transaction and start
>> new one again in the case. I hope we are on the same page...
>
> Could I confirm what you mean by "start new one"?
>
> In the current pgbench, when a query raises an error (a deadlock or
> serialization failure), it can be retried using the same random state.
> This typically means the query will be retried with the same parameter values.
>
> On the other hand, when the query ultimately fails (possibly after some retries),
> the transaction is marked as a "failure", and the next transaction starts with a
> new random state (i.e., with new parameter values).
>
> Therefore, if a query fails due to a unique constraint violation and is retried
> with the same parameters, it will keep failing on each retry.

Thank you for your explanation. I understand it as you described. I've also
attached a schematic diagram of the state machine. I hope it will help clarify
the behavior of pgbench. Red arrows represent the transition of state when SQL
command fails and --continue-on-error option is specified.

Best Regards,
Rintaro Ikeda

Attachment Content-Type Size
v5-0001-Add-continue-on-error-option-to-pgbench.patch text/plain 15.8 KB
v5-0002-Rename-confusing-enumerator.patch text/plain 1.1 KB
StateMachine_pgbench.jpg image/jpeg 117.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2025-07-01 08:47:06 Re: Adding support for SSLKEYLOGFILE in the frontend
Previous Message Peter Eisentraut 2025-07-01 08:31:38 Re: C11 / VS 2019