Re: Suggestion to add --continue-client-on-abort option to pgbench

From: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
To: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
Cc: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Rintaro Ikeda <ikedarintarof(at)oss(dot)nttdata(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "slpmcf(at)gmail(dot)com" <slpmcf(at)gmail(dot)com>, "boekewurm+postgres(at)gmail(dot)com" <boekewurm+postgres(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "Srinath Reddy Sadipiralla" <srinath2133(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: Suggestion to add --continue-client-on-abort option to pgbench
Date: 2025-09-30 01:23:53
Message-ID: 20250930102353.e368966be54d568e87e3ea95@sraoss.co.jp
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 26 Sep 2025 11:44:42 +0900
Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:

> On Thu, 25 Sep 2025 17:17:36 +0800
> Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>
> > Hi Yugo,
> >
> > Thanks for the patch. After reviewing it, I got a few small comments:
>
> Thank you for your reviewing and comments.
>
> > > On Sep 25, 2025, at 15:22, Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:
> > >
> > > --
> > > Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp <mailto:nagata(at)sraoss(dot)co(dot)jp>>
> > > <v13-0003-Improve-error-messages-for-errors-that-cause-cli.patch><v13-0002-Add-continue-on-error-option.patch><v13-0001-Fix-assertion-failure-and-verbose-messages-in-pi.patch>
> >
> >
> > 1 - 0001
> > ```
> > @@ -3265,6 +3271,7 @@ readCommandResponse(CState *st, MetaCommand meta, char *varprefix)
> > PGresult *res;
> > PGresult *next_res;
> > int qrynum = 0;
> > + char *errmsg;
> > ```
> >
> > I think we should initialize errmsg to NULL. Compiler won’t auto initialize a local variable. If it happens to not enter the while loop, then errmsg will hold a random value, then pg_free(errmsg) will have trouble.
>
> I think this initialization is unnecessary, just like for res and next_res.
> If the code happens not to enter the while loop, pg_free(errmsg) will not be
> called anyway, since the error: label is only reachable from inside the loop.
>
> > 2 - 0002
> > ```
> > + <para>
> > + Allows clients to continue their run even if an SQL statement fails due to
> > + errors other than serialization or deadlock. Unlike serialization and deadlock
> > + failures, clients do not retry the same transactions but start new transaction.
> > + This option is useful when your custom script may raise errors due to some
> > + reason like unique constraints violation. Without this option, the client is
> > + aborted after such errors.
> > + </para>
> > ```
> >
> > A few nit suggestions:
> >
> > * “continue their run” => “continue running”
>
> Fixed.
>
> > * “clients to not retry the same transactions but start new transaction” => “clients do not retry the same transaction but start a new transaction instead"
>
> I see your point. Maybe we could follow Anthonin Bonnefoy's suggestion
> to use "proceed to the next transaction", as it may sound a bit more natural.
>
> > * “due to some reason like” => “for reasons such as"
>
> Fixed.
>
> > 3 - 0002
> > ```
> > + * Without --continue-on-error:
> > * failed (the number of failed transactions) =
> > ```
> >
> > Maybe add an empty line after “without” line.
>
> Makes sense. Fixed.
>
> > 4 - 0002
> > ```
> > + * When --continue-on-error is specified:
> > + *
> > + * failed (number of failed transactions) =
> > ```
> >
> > Maybe change to “With ---continue-on-error”, which sounds consistent with the previous “without”.
>
> Fixed.
>
> > 5 - 0002
> > ```
> > + int64 other_sql_failures; /* number of failed transactions for
> > + * reasons other than
> > + * serialization/deadlock failure, which
> > + * is counted if --continue-on-error is
> > + * specified */
> > ```
> >
> > How about rename this variable to “sql_errors”, which reflects to the new option name.
>
> I think it’s better to keep the current name, since the variable counts failed transactions,
> even though that happens to be equivalent to the number of SQL errors. It’s also consistent
> with the other variables, serialization_failures and deadlock_failures.
>
> > 6 - 0002
> > ```
> > @@ -4571,6 +4594,8 @@ getResultString(bool skipped, EStatus estatus)
> > return "serialization";
> > case ESTATUS_DEADLOCK_ERROR:
> > return "deadlock";
> > + case ESTATUS_OTHER_SQL_ERROR:
> > + return "other”;
> > ```
> >
> > I think this can just return “error”. I checked where this function is called, there will not be other words such as “error” appended.
>
> getResultString() is called to get a string that represents the type of error
> causing the transaction failure, so simply returning "error" doesn’t seem very
> useful.
>
> > 7 - 0002
> > ```
> > /* it can be non-zero only if max_tries is not equal to one */
> > @@ -6569,6 +6602,10 @@ printResults(StatsData *total,
> > sstats->deadlock_failures,
> > (100.0 * sstats->deadlock_failures /
> > script_total_cnt));
> > + printf(" - number of other failures: " INT64_FORMAT " (%.3f%%)\n",
> > + sstats->other_sql_failures,
> > + (100.0 * sstats->other_sql_failures /
> > + script_total_cnt));
> > ```
> >
> > Do we only want to print this number when “―continue-on-error” is given?
>
> We could do that, but this message is printed only when
> --failures-detailed is specified. So I think users would not mind
> if it shows that the number of other failures is zero, even when
> --continue-on-error is not specified.
>
> I would appreciate hearing other people's opinions on this.
>
>
> I've attached updated patches that include fixes for some of your
> suggestions and for Anthonin Bonnefoy's suggestion on the documentation.
>
> I also split the patch according to Fujii-san's suggestion.

Fujii-san, thank you for committing the patch that fixes the assertion failure.
I've attached the remaining patches so that cfbot stays green.

Regards,
Yugo Nagata

--
Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>

Attachment Content-Type Size
v15-0003-pgbench-Improve-error-messages-for-errors-that-c.patch text/x-diff 3.0 KB
v15-0002-pgbench-Add-continue-on-error-option.patch text/x-diff 14.9 KB
v15-0001-pgbench-Do-not-reference-error-message-after-ano.patch text/x-diff 2.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2025-09-30 01:58:11 Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Previous Message Michael Paquier 2025-09-30 00:51:03 Re: [PATCH] Add tests for Bitmapset