Re: Suggestion to add --continue-client-on-abort option to pgbench

From:	Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
To:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc:	Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Rintaro Ikeda <ikedarintarof(at)oss(dot)nttdata(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "slpmcf(at)gmail(dot)com" <slpmcf(at)gmail(dot)com>, "boekewurm+postgres(at)gmail(dot)com" <boekewurm+postgres(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject:	Re: Suggestion to add --continue-client-on-abort option to pgbench
Date:	2025-09-19 15:21:19
Message-ID:	20250920002119.c3c75a4cae1daf69789db45f@sraoss.co.jp
Lists:	pgsql-hackers

On Fri, 19 Sep 2025 19:21:29 +0900
Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:

> On Fri, Sep 19, 2025 at 11:43 AM Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> >
> > On Thu, Sep 18, 2025 at 4:20 PM Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:
> > >
> > > On Thu, 18 Sep 2025 14:37:29 +0900
> > > Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> > >
> > > > On Thu, Sep 18, 2025 at 10:22 AM Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:
> > > > > That makes sense. How about rewriting this like:
> > > > >
> > > > > However, if the --continue-on-error option is specified and the error occurs in
> > > > > an SQL command, the client does not abort and proceeds to the next
> > > > > transaction regardless of the error. These cases are reported as "other failures"
> > > > > in the output. Note that if the error occurs in a meta-command, the client will
> > > > > still abort even when this option is specified.
> > > >
> > > > How about phrasing it like this, based on your version?
> > > >
> > > > ----------------------------
> > > > A client's run is aborted in case of a serious error; for example, the
> > > > connection with the database server was lost or the end of script was reached
> > > > without completing the last transaction. The client also aborts
> > > > if a meta-command fails, or if an SQL command fails for reasons other than
> > > > serialization or deadlock errors when --continue-on-error is not specified.
> > > > With --continue-on-error, the client does not abort on such SQL errors
> > > > and instead proceeds to the next transaction. These cases are reported
> > > > as "other failures" in the output. If the error occurs in a meta-command,
> > > > however, the client still aborts even when this option is specified.
> > > > ----------------------------
> > >
> > > I'm fine with that. This version is clearer.
> >
> > Thanks for checking!
>
> I've updated the 0001 patch based on the comments.
> The revised version is attached.

Thank you for updating the patch.

>
> While testing, I found that running pgbench with --continue-on-error and
> pipeline mode triggers the following assertion failure. Could this be
> a bug in the patch?
>
> ---------------------------------------------------
> $ cat pipeline.pgbench
> \startpipeline
> DO $$
> BEGIN
> PERFORM pg_sleep(3);
> PERFORM pg_terminate_backend(pg_backend_pid());
> END $$;
> \endpipeline
>
> $ pgbench -n --debug --verbose-errors -f pipeline.pgbench -c 2 -t 4 -M extended --continue-on-error
> ...
> Assertion failed: (sql_script[st->use_file].commands[st->command]->type == 1),
> function commandError, file pgbench.c, line 3081.
> Abort trap: 6
> ---------------------------------------------------
>
> When I ran the same command without --continue-on-error,
> the assertion failure did not occur.

I think this bug was introduced by commit 4a39f87acd6e, which enabled pgbench
to retry failed transactions and added the --verbose-errors option, rather than
by this patch itself.

The assertion failure occurs in commandError(), which is called to report an error
that can be retried (i.e., a serialization failure or deadlock) or, after this patch,
an error that occurs while --continue-on-error is in use.

Assert(sql_script[st->use_file].commands[st->command]->type == SQL_COMMAND);

This assumes the error is always detected while an SQL command is being executed,
but that is not correct: in pipeline mode, the error can also be detected when
a \endpipeline meta-command is executed.
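
To illustrate the point, here is a minimal, self-contained sketch of how the
check could be relaxed to also accept \endpipeline. It is a toy model, not the
attached patch: the Command struct, the MetaCommand values, and both check
functions are simplified assumptions that only mirror the names used in
pgbench.c.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of a pgbench script command. The names mirror pgbench.c, but
 * these definitions are simplified assumptions, not the real ones.
 */
enum { SQL_COMMAND = 1, META_COMMAND = 2 };

typedef enum
{
    META_NONE,
    META_SET,
    META_STARTPIPELINE,
    META_ENDPIPELINE
} MetaCommand;

typedef struct
{
    int         type;   /* SQL_COMMAND or META_COMMAND */
    MetaCommand meta;   /* meaningful only when type == META_COMMAND */
} Command;

/* Current assumption: an error is only ever reported at an SQL command. */
bool
strict_check(const Command *cmd)
{
    return cmd->type == SQL_COMMAND;
}

/*
 * Relaxed assumption (hypothetical): in pipeline mode the error may also
 * surface when \endpipeline is processed, so accept that meta-command too.
 */
bool
relaxed_check(const Command *cmd)
{
    return cmd->type == SQL_COMMAND ||
        (cmd->type == META_COMMAND && cmd->meta == META_ENDPIPELINE);
}
```

With this model, a command representing \endpipeline fails strict_check() but
passes relaxed_check(), while other meta-commands such as \set are still
rejected by both.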

$ cat deadlock.sql
\startpipeline
begin;
lock b;
lock a;
end;
\endpipeline

$ cat deadlock2.sql
\startpipeline
begin;
lock a;
lock b;
end;
\endpipeline

$ pgbench --verbose-errors -f deadlock.sql -f deadlock2.sql -c 2 -T 3 -M extended
pgbench (19devel)
starting vacuum...end.
pgbench: pgbench.c:3062: commandError: Assertion `sql_script[st->use_file].commands[st->command]->type == 1' failed.

One option would be to simply remove this assertion. If we prefer to keep it,
the attached patch fixes the issue.

Regards,
Yugo Nagata

--
Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>

Attachment	Content-Type	Size
fix_pgbench_assertion_failure_in_pipeline.patch.txt	text/plain	690 bytes

In response to

Re: Suggestion to add --continue-client-on-abort option to pgbench at 2025-09-19 10:21:29 from Fujii Masao

Responses

Re: Suggestion to add --continue-client-on-abort option to pgbench at 2025-09-22 02:56:31 from Fujii Masao
