| From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
|---|---|
| To: | nadav(at)tailorbrands(dot)com |
| Cc: | pgpool-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Proposal: recent access based routing for primary-replica setups |
| Date: | 2025-12-28 23:48:44 |
| Message-ID: | 20251229.084844.1964712636774742758.ishii@postgresql.org |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgpool-hackers |
> thank you! works for me. should we merge both into master or do you want me
> to send a combined one?
> do you want me to send a combined one?
Yes, please send a combined one.
I will do more tests and detailed code review.
Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
> On Fri, Dec 26, 2025 at 12:03 PM Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>
>> Hi Nadav,
>>
>> I just want to make it clear. The patch should be applied on top of
>> your latest.patch.
>>
>> > (Please disregard previous mail. I seem to have mangled the message).
>> >
>> > I think I found a cause of the problem. On Linux, if SIGCHLD is
>> > ignored (set to SIG_IGN), waitpid() cannot get proper child status.
>> > Because the kernel relcaims the resource for the child process to not
>> > make the child process a zombie. And this makes waitpid() to fail with
>> > ECHLD. Since the return of waitpid() is not checked, I did not notice
>> > the waitpid() failure (I recommend to check the return value of
>> > waitpid()).
>> >
>> > /* set up signal handlers */
>> > signal(SIGALRM, SIG_DFL);
>> > signal(SIGTERM, my_signal_handler);
>> > signal(SIGINT, my_signal_handler);
>> > signal(SIGHUP, reload_config_handler);
>> > signal(SIGQUIT, my_signal_handler);
>> > signal(SIGCHLD, SIG_IGN); <--- SIGCHLD is ignored
>> > signal(SIGUSR1, my_signal_handler);
>> > signal(SIGUSR2, SIG_IGN);
>> > signal(SIGPIPE, SIG_IGN);
>> >
>> > To fix this, either change the line above to:
>> >
>> > signal(SIGCHLD, SIG_DFL);
>> > or
>> > signal(SIGCHLD, my_signal_handler);
>> > and modify my_signal_handler.
>> >
>> > I recommend the latter, because it does not depend on the default
>> > behavior of SIGCHLD, which might be different per platform.
>> > Attached is the patch to do this. (and run pgindent).
>> > I also notice that something like:
>> >
>> > /* Count tokens in output for validation */
>> > char *line_copy = pstrdup(line);
>> > char *temp_token = strtok(line_copy, " \t\n");
>> >
>> > You should declare line_copy and temp_token in the begging of the code
>> > block (or in the outer block). The forward declaration is recommended
>> > coding style in Pgpool-II (and PostgreSQL). Same thing can be said to
>> > some other variables.
>> >
>> > Best regards,
>> > --
>> > Tatsuo Ishii
>> > SRA OSS K.K.
>> > English: http://www.sraoss.co.jp/index_en/
>> > Japanese:http://www.sraoss.co.jp
>> >
>> >> Hi Tatsuo,
>> >>
>> >> Thank you for the note.
>> >>
>> >> I've removed the docker stuff. started working in an ubuntu 24 VM to
>> match
>> >> the setup. hopefully the results will be better, had so many issues
>> >> compiling and testing before that stuff wasn't properly formulated.
>> >>
>> >> Attaching the latest patch.
>> >>
>> >> this is what i'm seeing:
>> >> adav(at)lima-dev:/src/pgpool2/src/test/regression$ PGHOST=/tmp
>> ./regress.sh -p
>> >> /usr/bin 041.external_replication_delay
>> >> creating pgpool-II temporary installation ...
>> >> moving pgpool_setup to temporary installation path ...
>> >> moving watchdog_setup to temporary installation path ...
>> >> using pgpool-II at /src/pgpool2/src/test/regression/temp/installed
>> >> *************************
>> >> REGRESSION MODE : install
>> >> Pgpool-II version : pgpool-II version 4.8devel (mitsukakeboshi)
>> >> Pgpool-II install path :
>> /src/pgpool2/src/test/regression/temp/installed
>> >> PostgreSQL bin : /usr/lib/postgresql/16/bin
>> >> PostgreSQL Major version : 16
>> >> pgbench : /usr/lib/postgresql/16/bin/pgbench
>> >> PostgreSQL jdbc :
>> >> /usr/local/pgsql/share/postgresql-9.2-1003.jdbc4.jar
>> >> *************************
>> >> testing 041.external_replication_delay...ok.
>> >> out of 1 ok:1 failed:0 timeout:0
>> >>
>> >>
>> >>
>> >> On Tue, Dec 23, 2025 at 10:46 AM Tatsuo Ishii <ishii(at)postgresql(dot)org>
>> wrote:
>> >>
>> >>> > Hi Tatsuo,
>> >>> >
>> >>> > I'km running into issues testing this and have created a full docker
>> >>> > compose setup - can you please point me to up to date guides on the
>> best
>> >>> > way to run the tests so i know we're doing it the same way?
>> >>> >
>> >>> > Thank you for all your help!
>> >>>
>> >>> I have run the regression test on the Pgpool-II master branch on my
>> >>> Ubuntu 24 box.
>> >>>
>> >>> cd pgpool2/src/test/regression
>> >>> ./regress.sh 041
>> >>>
>> >>> This time I noticed:
>> >>>
>> >>> - The patch does not named with version number
>> >>> - The patch creates .dockerignore and docker/ directory.
>> >>>
>> >>> Are they intended? I am asking because they are different from the
>> >>> previous version.
>> >>>
>> >>> > On Tue, Dec 23, 2025 at 2:13 AM Tatsuo Ishii <ishii(at)postgresql(dot)org>
>> >>> wrote:
>> >>> >
>> >>> >> > I think everything is passing now. new version attached.
>> >>> >>
>> >>> >> Unfortunately Test1 did not pass.
>> >>> >>
>> >>> >> === Test1: Basic external command with integer millisecond values
>> ===
>> >>> >> waiting for server to start....1438600 2025-12-23 09:09:48.337 JST
>> LOG:
>> >>> >> redirecting log output to logging collector process
>> >>> >> 1438600 2025-12-23 09:09:48.337 JST HINT: Future log output will
>> appear
>> >>> >> in directory "log".
>> >>> >> done
>> >>> >> server started
>> >>> >> waiting for server to start....1438617 2025-12-23 09:09:48.443 JST
>> LOG:
>> >>> >> redirecting log output to logging collector process
>> >>> >> 1438617 2025-12-23 09:09:48.443 JST HINT: Future log output will
>> appear
>> >>> >> in directory "log".
>> >>> >> done
>> >>> >> server started
>> >>> >> waiting for server to start....1438634 2025-12-23 09:09:48.561 JST
>> LOG:
>> >>> >> redirecting log output to logging collector process
>> >>> >> 1438634 2025-12-23 09:09:48.561 JST HINT: Future log output will
>> appear
>> >>> >> in directory "log".
>> >>> >> done
>> >>> >> server started
>> >>> >> CREATE TABLE
>> >>> >> Waiting for sr_check to run...
>> >>> >> Command executed after 1 seconds
>> >>> >> node_id | hostname | port | status | pg_status | lb_weight |
>> role
>> >>> |
>> >>> >> pg_role | select_cnt | load_balance_node | replication_delay |
>> >>> >> replication_state | replication_sync_state | last_status_change
>> >>> >>
>> >>> >>
>> >>>
>> ---------+-----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>> >>> >> 0 | localhost | 11002 | up | up | 0.333333 |
>> primary
>> >>> |
>> >>> >> primary | 0 | true | 0 |
>> >>> >> | | 2025-12-23 09:09:49
>> >>> >> 1 | localhost | 11003 | up | up | 0.333333 |
>> standby
>> >>> |
>> >>> >> standby | 0 | false | 0 |
>> >>> >> | | 2025-12-23 09:09:49
>> >>> >> 2 | localhost | 11004 | up | up | 0.333333 |
>> standby
>> >>> |
>> >>> >> standby | 0 | false | 0 |
>> >>> >> | | 2025-12-23 09:09:49
>> >>> >> (3 rows)
>> >>> >>
>> >>> >> fail: external command delay logging not found
>> >>> >>
>> >>> >> > On Mon, Nov 24, 2025 at 9:41 AM Tatsuo Ishii <
>> ishii(at)postgresql(dot)org>
>> >>> >> wrote:
>> >>> >> >
>> >>> >> >> Thank you for updating the patch! This time the patch applies
>> without
>> >>> >> >> any issue and compiles fine. Unfortunately regression test
>> failed.
>> >>> >> >>
>> >>> >> >> testing 041.external_replication_delay...failed.
>> >>> >> >>
>> >>> >> >> From the regression log, it seems Test7 failed.
>> >>> >> >>
>> >>> >> >>
>> >>> >>
>> >>>
>> ------------------------------------------------------------------------------
>> >>> >> >> === Test7: Command timeout handling ===
>> >>> >> >> waiting for server to start....411181 2025-11-24 16:31:05.244 JST
>> >>> LOG:
>> >>> >> >> redirecting log output to logging collector process
>> >>> >> >> 411181 2025-11-24 16:31:05.244 JST HINT: Future log output will
>> >>> appear
>> >>> >> in
>> >>> >> >> directory "log".
>> >>> >> >> done
>> >>> >> >> server started
>> >>> >> >> waiting for server to start....411196 2025-11-24 16:31:05.352 JST
>> >>> LOG:
>> >>> >> >> redirecting log output to logging collector process
>> >>> >> >> 411196 2025-11-24 16:31:05.352 JST HINT: Future log output will
>> >>> appear
>> >>> >> in
>> >>> >> >> directory "log".
>> >>> >> >> done
>> >>> >> >> server started
>> >>> >> >> waiting for server to start....411213 2025-11-24 16:31:05.461 JST
>> >>> LOG:
>> >>> >> >> redirecting log output to logging collector process
>> >>> >> >> 411213 2025-11-24 16:31:05.461 JST HINT: Future log output will
>> >>> appear
>> >>> >> in
>> >>> >> >> directory "log".
>> >>> >> >> done
>> >>> >> >> server started
>> >>> >> >> Waiting for command timeout...
>> >>> >> >> fail: command timeout not detected
>> >>> >> >>
>> >>> >> >>
>> >>> >>
>> >>>
>> ------------------------------------------------------------------------------
>> >>> >> >>
>> >>> >> >> Attached is the pgpool.log. If you need more info, please let me
>> >>> know.
>> >>> >> >>
>> >>> >> >> Best regards,
>> >>> >> >> --
>> >>> >> >> Tatsuo Ishii
>> >>> >> >> SRA OSS K.K.
>> >>> >> >> English: http://www.sraoss.co.jp/index_en/
>> >>> >> >> Japanese:http://www.sraoss.co.jp
>> >>> >> >>
>> >>> >> >>
>> >>> >> >> > Hi Tatsuo,
>> >>> >> >> >
>> >>> >> >> > Sorry again, this was due to the separation of 2 patches and i
>> only
>> >>> >> sent
>> >>> >> >> > the one.
>> >>> >> >> >
>> >>> >> >> > I've merged it into 1 commit and 1 patch and rebased over
>> master to
>> >>> >> avoid
>> >>> >> >> > these issues moving forward.
>> >>> >> >> >
>> >>> >> >> > PFA latest version
>> >>> >> >> >
>> >>> >> >> > On Thu, Nov 20, 2025 at 1:09 AM Tatsuo Ishii <
>> ishii(at)postgresql(dot)org
>> >>> >
>> >>> >> >> wrote:
>> >>> >> >> >
>> >>> >> >> >> Hi Nadav,
>> >>> >> >> >>
>> >>> >> >> >> Thank you for new patch.
>> >>> >> >> >> Unfortunately the patch did not apply to current master.
>> >>> >> >> >>
>> >>> >> >> >> $ git apply
>> >>> >> >> >>
>> ~/0001-Fix-multiple-issues-in-external-replication-delay-fe.patch
>> >>> >> >> >> error: patch failed:
>> >>> >> src/streaming_replication/pool_worker_child.c:694
>> >>> >> >> >> error: src/streaming_replication/pool_worker_child.c: patch
>> does
>> >>> not
>> >>> >> >> apply
>> >>> >> >> >>
>> >>> >> >> >> Maybe the patch is on top of your previous patch?
>> >>> >> >> >>
>> >>> >> >> >> Also I suggest to use "-v" option of "git format-patch" to
>> add the
>> >>> >> >> >> patch version number so that we can easily know which patch
>> is the
>> >>> >> >> >> latest.
>> >>> >> >> >>
>> >>> >> >> >> Best regards,
>> >>> >> >> >> --
>> >>> >> >> >> Tatsuo Ishii
>> >>> >> >> >> SRA OSS K.K.
>> >>> >> >> >> English: http://www.sraoss.co.jp/index_en/
>> >>> >> >> >> Japanese:http://www.sraoss.co.jp
>> >>> >> >> >>
>> >>> >> >> >> > Hi Tatsuo,
>> >>> >> >> >> >
>> >>> >> >> >> > Please see attached an updated version.
>> >>> >> >> >> >
>> >>> >> >> >> > thank you
>> >>> >> >> >> >
>> >>> >> >> >> > On Fri, Nov 7, 2025 at 2:07 AM Tatsuo Ishii <
>> >>> ishii(at)postgresql(dot)org>
>> >>> >> >> >> wrote:
>> >>> >> >> >> >
>> >>> >> >> >> >> > Sorry for that - thanks for the patch.
>> >>> >> >> >> >> >
>> >>> >> >> >> >> > Please find attached a new version
>> >>> >> >> >> >>
>> >>> >> >> >> >> Thanks for the new version. Unfortunately this time
>> regression
>> >>> >> test
>> >>> >> >> >> >> fails at:
>> >>> >> >> >> >>
>> >>> >> >> >> >> > Waiting for command timeout...
>> >>> >> >> >> >> > fail: command timeout not detected
>> >>> >> >> >> >>
>> >>> >> >> >> >> Attached is the pgpool.log.
>> >>> >> >> >> >>
>> >>> >> >> >> >> Best regards,
>> >>> >> >> >> >> --
>> >>> >> >> >> >> Tatsuo Ishii
>> >>> >> >> >> >> SRA OSS K.K.
>> >>> >> >> >> >> English: http://www.sraoss.co.jp/index_en/
>> >>> >> >> >> >> Japanese:http://www.sraoss.co.jp
>> >>> >> >> >> >>
>> >>> >> >> >> >> > On Mon, Nov 3, 2025 at 9:05 AM Tatsuo Ishii <
>> >>> >> ishii(at)postgresql(dot)org>
>> >>> >> >> >> >> wrote:
>> >>> >> >> >> >> >
>> >>> >> >> >> >> >> > thanks and sorry for the issues, please find attached
>> >>> updated
>> >>> >> >> >> version.
>> >>> >> >> >> >> >>
>> >>> >> >> >> >> >> No problem.
>> >>> >> >> >> >> >>
>> >>> >> >> >> >> >> This time the patch applies fine, no compiler warnings.
>> >>> >> However,
>> >>> >> >> >> >> >> regression test did not passed here (on Ubuntu 24 LTS if
>> >>> this
>> >>> >> >> >> >> >> matters). So I looked into
>> >>> >> >> >> >> >>
>> >>> >> src/test/regression/tests/041.external_replication_delay/test.sh a
>> >>> >> >> >> >> >> little bit and apply attached patch (test.sh.patch). It
>> >>> moved
>> >>> >> >> forward
>> >>> >> >> >> >> >> partially but failed at:
>> >>> >> >> >> >> >>
>> >>> >> >> >> >> >> fail: command execution failure not detected
>> >>> >> >> >> >> >>
>> >>> >> >> >> >> >> Please find attached
>> >>> >> >> >> >> >>
>> >>> >> >> >> >>
>> >>> >> >> >>
>> >>> >> >>
>> >>> >>
>> >>>
>> src/test/regression/tests/041.external_replication_delay/testdir/pgpool.log
>> >>> >> >> >> >> >> and
>> src/test/regression/log/041.external_replication_delay.
>> >>> >> >> >> >> >>
>> >>> >> >> >> >> >> Best regards,
>> >>> >> >> >> >> >> --
>> >>> >> >> >> >> >> Tatsuo Ishii
>> >>> >> >> >> >> >> SRA OSS K.K.
>> >>> >> >> >> >> >> English: http://www.sraoss.co.jp/index_en/
>> >>> >> >> >> >> >> Japanese:http://www.sraoss.co.jp
>> >>> >> >> >> >> >>
>> >>> >> >> >> >> >
>> >>> >> >> >> >> >
>> >>> >> >> >> >> > --
>> >>> >> >> >> >> > Nadav Shatz
>> >>> >> >> >> >> > Tailor Brands | CTO
>> >>> >> >> >> >>
>> >>> >> >> >> >
>> >>> >> >> >> >
>> >>> >> >> >> > --
>> >>> >> >> >> > Nadav Shatz
>> >>> >> >> >> > Tailor Brands | CTO
>> >>> >> >> >>
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> > --
>> >>> >> >> > Nadav Shatz
>> >>> >> >> > Tailor Brands | CTO
>> >>> >> >>
>> >>> >> >
>> >>> >> >
>> >>> >> > --
>> >>> >> > Nadav Shatz
>> >>> >> > Tailor Brands | CTO
>> >>> >>
>> >>> >
>> >>> >
>> >>> > --
>> >>> > Nadav Shatz
>> >>> > Tailor Brands | CTO
>> >>>
>> >>
>> >>
>> >> --
>> >> Nadav Shatz
>> >> Tailor Brands | CTO
>>
>
>
> --
> Nadav Shatz
> Tailor Brands | CTO
<
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tatsuo Ishii | 2025-12-28 23:58:03 | Re: Proposal: recent access based routing for primary-replica setups |
| Previous Message | Nadav Shatz | 2025-12-28 12:21:15 | Re: Proposal: recent access based routing for primary-replica setups |