| From: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
|---|---|
| To: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
| Cc: | Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>, Rintaro Ikeda <ikedarintarof(at)oss(dot)nttdata(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "slpmcf(at)gmail(dot)com" <slpmcf(at)gmail(dot)com>, "boekewurm+postgres(at)gmail(dot)com" <boekewurm+postgres(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
| Subject: | Re: Suggestion to add --continue-client-on-abort option to pgbench |
| Date: | 2025-11-14 07:44:38 |
| Message-ID: | 867A57A3-3AAF-4E02-85E9-71BE8EA4ACAB@gmail.com |
| Lists: | pgsql-hackers |
> On Nov 13, 2025, at 21:55, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>
> On Thu, Nov 13, 2025 at 4:09 PM Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:
>> Thank you for your review!
>> I've attached an updated patch reflecting your suggestion.
>
> Thanks for updating the patch! LGTM.
>
> You mentioned that the assertion failure could occur when using \syncpipeline,
> but it seems that multiple PGRES_PIPELINE_SYNC results can also appear
> even without it, which can still trigger the same issue. For example,
> I was able to reproduce the assertion failure in v16 (which doesn't support
> \syncpipeline) with the following setup:
>
> --------------------------------
> $ cat deadlock.sql
> \startpipeline
> select * from a order by i for update;
> select 1;
> \endpipeline
>
> $ cat deadlock2.sql
> \startpipeline
> select * from a order by i desc for update;
> select 1;
> \endpipeline
>
> $ psql -c "create table a (i int primary key); insert into a
> values(generate_series(1,1000));"
>
> $ pgbench -n -j 4 -c 4 -T 5 -M extended -f deadlock.sql -f deadlock2.sql
> ...
> Assertion failed: (res == ((void *)0)), function discardUntilSync,
> file pgbench.c, line 3479.
> --------------------------------
>
> So I've updated the commit message to clarify that while using \syncpipeline
> makes the failure more likely, it can still occur without it. Since the issue
> can also happen in v15 and v16 (which both lack \syncpipeline), I plan to
> backpatch the fix to v15. The failure doesn't occur in v14 because it doesn't
> support retriable error retries.
>
> I've also made a few cosmetic tweaks to the patch. Attached is the updated
> version, which I plan to push.
>
> Regards,
>
> --
> Fujii Masao
> <v4-0001-pgbench-PG15-PG16-Fix-assertion-failure-when-discarding-res.txt><v4-0001-pgbench-Fix-assertion-failure-when-discarding-res.patch>
I had mistakenly assumed that "\syncpipeline" would recover the transaction. With that confusion cleared up, I think the v4 patch is good overall. I have only one small comment:
```
+ else if (received_sync && res == NULL)
{
- /*
- * PGRES_PIPELINE_SYNC must be followed by another
- * PGRES_PIPELINE_SYNC or NULL; otherwise, assert failure.
- */
- Assert(res == NULL);
-
/*
* Reset ongoing sync count to 0 since all PGRES_PIPELINE_SYNC
* results have been discarded.
@@ -3601,6 +3610,15 @@ discardUntilSync(CState *st)
PQclear(res);
break;
}
```
Now that "res == NULL" is part of the "else if" condition, res is always NULL inside the "else if (received_sync && res == NULL)" branch, so the "PQclear(res);" there is redundant and should be deleted. Leaving it in is harmless today, since PQclear() does nothing when passed a NULL pointer, but it is error-prone: if someone later removes "res == NULL" from the "else if", it could lead to a double free, because PQclear(res) would be called again after the "break".
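To illustrate the point, here is a minimal, self-contained simulation of the loop shape, not the actual pgbench code. All names (mock_result, mock_clear, mock_result_status, drain_until_sync) are hypothetical stand-ins I made up for PGresult, PQclear(), PQresultStatus() and discardUntilSync(); the only libpq behaviors it assumes are that PQclear(NULL) is a no-op and that PQresultStatus(NULL) does not report PGRES_PIPELINE_SYNC. The branch guarded by "res == NULL" simply exits without clearing anything:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for libpq types; this is NOT the real API. */
typedef enum { MOCK_OTHER, MOCK_PIPELINE_SYNC } mock_status;
typedef struct { mock_status status; } mock_result;

static int mock_clear_calls = 0;	/* number of non-NULL results freed */

/* Like PQclear(): passing NULL is a no-op. */
static void
mock_clear(mock_result *res)
{
	if (res == NULL)
		return;
	mock_clear_calls++;
	free(res);
}

/* A NULL result never reports "sync" (assumption mirroring PQresultStatus). */
static mock_status
mock_result_status(const mock_result *res)
{
	return res ? res->status : MOCK_OTHER;
}

/*
 * Drain a scripted result stream: 'S' = sync result, 'N' = NULL,
 * anything else = some other result.  Returns the number of entries
 * consumed up to and including the terminating NULL, or -1 if the
 * script ends first (or allocation fails).
 */
static int
drain_until_sync(const char *script)
{
	bool		received_sync = false;
	int			consumed = 0;

	for (const char *p = script; *p != '\0'; p++)
	{
		mock_result *res = NULL;

		if (*p != 'N')
		{
			res = malloc(sizeof(*res));
			if (res == NULL)
				return -1;
			res->status = (*p == 'S') ? MOCK_PIPELINE_SYNC : MOCK_OTHER;
		}
		consumed++;

		if (mock_result_status(res) == MOCK_PIPELINE_SYNC)
			received_sync = true;
		else if (received_sync && res == NULL)
		{
			/* res is known to be NULL here, so there is nothing to clear. */
			return consumed;
		}
		mock_clear(res);		/* the single place where results are freed */
	}
	return -1;
}
```

With a single clear call per iteration, each non-NULL result is freed exactly once, and that stays true regardless of how the exit condition is edited later.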
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/