| From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
|---|---|
| To: | nadav(at)tailorbrands(dot)com |
| Cc: | pgpool-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Proposal: recent access based routing for primary-replica setups |
| Date: | 2025-12-24 02:46:36 |
| Message-ID: | 20251224.114636.2037102177156277413.ishii@postgresql.org |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgpool-hackers |
Hi Nadav,
Thank you for the updated patch. Unfortunately it continues to
fail. Looking into the code and log, it seems sr_check_worker fails at
here:
else if (WIFSIGNALED(status))
{
ereport(ERROR,
(errmsg("replication delay command terminated by signal %d: %s",
WTERMSIG(status), command)));
}
A strange thing is, the the signal number looks random: 6, 9, 62.... I
have no idea why this could happen. If I change ERROR to DEBUG1, the
test sometimes succeeds and sometimes fails. I will look into this.
Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
> Hi Tatsuo,
>
> Thank you for the note.
>
> I've removed the docker stuff. started working in an ubuntu 24 VM to match
> the setup. hopefully the results will be better, had so many issues
> compiling and testing before that stuff wasn't properly formulated.
>
> Attaching the latest patch.
>
> this is what i'm seeing:
> adav(at)lima-dev:/src/pgpool2/src/test/regression$ PGHOST=/tmp ./regress.sh -p
> /usr/bin 041.external_replication_delay
> creating pgpool-II temporary installation ...
> moving pgpool_setup to temporary installation path ...
> moving watchdog_setup to temporary installation path ...
> using pgpool-II at /src/pgpool2/src/test/regression/temp/installed
> *************************
> REGRESSION MODE : install
> Pgpool-II version : pgpool-II version 4.8devel (mitsukakeboshi)
> Pgpool-II install path : /src/pgpool2/src/test/regression/temp/installed
> PostgreSQL bin : /usr/lib/postgresql/16/bin
> PostgreSQL Major version : 16
> pgbench : /usr/lib/postgresql/16/bin/pgbench
> PostgreSQL jdbc :
> /usr/local/pgsql/share/postgresql-9.2-1003.jdbc4.jar
> *************************
> testing 041.external_replication_delay...ok.
> out of 1 ok:1 failed:0 timeout:0
>
>
>
> On Tue, Dec 23, 2025 at 10:46 AM Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>
>> > Hi Tatsuo,
>> >
>> > I'km running into issues testing this and have created a full docker
>> > compose setup - can you please point me to up to date guides on the best
>> > way to run the tests so i know we're doing it the same way?
>> >
>> > Thank you for all your help!
>>
>> I have run the regression test on the Pgpool-II master branch on my
>> Ubuntu 24 box.
>>
>> cd pgpool2/src/test/regression
>> ./regress.sh 041
>>
>> This time I noticed:
>>
>> - The patch does not named with version number
>> - The patch creates .dockerignore and docker/ directory.
>>
>> Are they intended? I am asking because they are different from the
>> previous version.
>>
>> > On Tue, Dec 23, 2025 at 2:13 AM Tatsuo Ishii <ishii(at)postgresql(dot)org>
>> wrote:
>> >
>> >> > I think everything is passing now. new version attached.
>> >>
>> >> Unfortunately Test1 did not pass.
>> >>
>> >> === Test1: Basic external command with integer millisecond values ===
>> >> waiting for server to start....1438600 2025-12-23 09:09:48.337 JST LOG:
>> >> redirecting log output to logging collector process
>> >> 1438600 2025-12-23 09:09:48.337 JST HINT: Future log output will appear
>> >> in directory "log".
>> >> done
>> >> server started
>> >> waiting for server to start....1438617 2025-12-23 09:09:48.443 JST LOG:
>> >> redirecting log output to logging collector process
>> >> 1438617 2025-12-23 09:09:48.443 JST HINT: Future log output will appear
>> >> in directory "log".
>> >> done
>> >> server started
>> >> waiting for server to start....1438634 2025-12-23 09:09:48.561 JST LOG:
>> >> redirecting log output to logging collector process
>> >> 1438634 2025-12-23 09:09:48.561 JST HINT: Future log output will appear
>> >> in directory "log".
>> >> done
>> >> server started
>> >> CREATE TABLE
>> >> Waiting for sr_check to run...
>> >> Command executed after 1 seconds
>> >> node_id | hostname | port | status | pg_status | lb_weight | role
>> |
>> >> pg_role | select_cnt | load_balance_node | replication_delay |
>> >> replication_state | replication_sync_state | last_status_change
>> >>
>> >>
>> ---------+-----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>> >> 0 | localhost | 11002 | up | up | 0.333333 | primary
>> |
>> >> primary | 0 | true | 0 |
>> >> | | 2025-12-23 09:09:49
>> >> 1 | localhost | 11003 | up | up | 0.333333 | standby
>> |
>> >> standby | 0 | false | 0 |
>> >> | | 2025-12-23 09:09:49
>> >> 2 | localhost | 11004 | up | up | 0.333333 | standby
>> |
>> >> standby | 0 | false | 0 |
>> >> | | 2025-12-23 09:09:49
>> >> (3 rows)
>> >>
>> >> fail: external command delay logging not found
>> >>
>> >> > On Mon, Nov 24, 2025 at 9:41 AM Tatsuo Ishii <ishii(at)postgresql(dot)org>
>> >> wrote:
>> >> >
>> >> >> Thank you for updating the patch! This time the patch applies without
>> >> >> any issue and compiles fine. Unfortunately regression test failed.
>> >> >>
>> >> >> testing 041.external_replication_delay...failed.
>> >> >>
>> >> >> From the regression log, it seems Test7 failed.
>> >> >>
>> >> >>
>> >>
>> ------------------------------------------------------------------------------
>> >> >> === Test7: Command timeout handling ===
>> >> >> waiting for server to start....411181 2025-11-24 16:31:05.244 JST
>> LOG:
>> >> >> redirecting log output to logging collector process
>> >> >> 411181 2025-11-24 16:31:05.244 JST HINT: Future log output will
>> appear
>> >> in
>> >> >> directory "log".
>> >> >> done
>> >> >> server started
>> >> >> waiting for server to start....411196 2025-11-24 16:31:05.352 JST
>> LOG:
>> >> >> redirecting log output to logging collector process
>> >> >> 411196 2025-11-24 16:31:05.352 JST HINT: Future log output will
>> appear
>> >> in
>> >> >> directory "log".
>> >> >> done
>> >> >> server started
>> >> >> waiting for server to start....411213 2025-11-24 16:31:05.461 JST
>> LOG:
>> >> >> redirecting log output to logging collector process
>> >> >> 411213 2025-11-24 16:31:05.461 JST HINT: Future log output will
>> appear
>> >> in
>> >> >> directory "log".
>> >> >> done
>> >> >> server started
>> >> >> Waiting for command timeout...
>> >> >> fail: command timeout not detected
>> >> >>
>> >> >>
>> >>
>> ------------------------------------------------------------------------------
>> >> >>
>> >> >> Attached is the pgpool.log. If you need more info, please let me
>> know.
>> >> >>
>> >> >> Best regards,
>> >> >> --
>> >> >> Tatsuo Ishii
>> >> >> SRA OSS K.K.
>> >> >> English: http://www.sraoss.co.jp/index_en/
>> >> >> Japanese:http://www.sraoss.co.jp
>> >> >>
>> >> >>
>> >> >> > Hi Tatsuo,
>> >> >> >
>> >> >> > Sorry again, this was due to the separation of 2 patches and i only
>> >> sent
>> >> >> > the one.
>> >> >> >
>> >> >> > I've merged it into 1 commit and 1 patch and rebased over master to
>> >> avoid
>> >> >> > these issues moving forward.
>> >> >> >
>> >> >> > PFA latest version
>> >> >> >
>> >> >> > On Thu, Nov 20, 2025 at 1:09 AM Tatsuo Ishii <ishii(at)postgresql(dot)org
>> >
>> >> >> wrote:
>> >> >> >
>> >> >> >> Hi Nadav,
>> >> >> >>
>> >> >> >> Thank you for new patch.
>> >> >> >> Unfortunately the patch did not apply to current master.
>> >> >> >>
>> >> >> >> $ git apply
>> >> >> >> ~/0001-Fix-multiple-issues-in-external-replication-delay-fe.patch
>> >> >> >> error: patch failed:
>> >> src/streaming_replication/pool_worker_child.c:694
>> >> >> >> error: src/streaming_replication/pool_worker_child.c: patch does
>> not
>> >> >> apply
>> >> >> >>
>> >> >> >> Maybe the patch is on top of your previous patch?
>> >> >> >>
>> >> >> >> Also I suggest to use "-v" option of "git format-patch" to add the
>> >> >> >> patch version number so that we can easily know which patch is the
>> >> >> >> latest.
>> >> >> >>
>> >> >> >> Best regards,
>> >> >> >> --
>> >> >> >> Tatsuo Ishii
>> >> >> >> SRA OSS K.K.
>> >> >> >> English: http://www.sraoss.co.jp/index_en/
>> >> >> >> Japanese:http://www.sraoss.co.jp
>> >> >> >>
>> >> >> >> > Hi Tatsuo,
>> >> >> >> >
>> >> >> >> > Please see attached an updated version.
>> >> >> >> >
>> >> >> >> > thank you
>> >> >> >> >
>> >> >> >> > On Fri, Nov 7, 2025 at 2:07 AM Tatsuo Ishii <
>> ishii(at)postgresql(dot)org>
>> >> >> >> wrote:
>> >> >> >> >
>> >> >> >> >> > Sorry for that - thanks for the patch.
>> >> >> >> >> >
>> >> >> >> >> > Please find attached a new version
>> >> >> >> >>
>> >> >> >> >> Thanks for the new version. Unfortunately this time regression
>> >> test
>> >> >> >> >> fails at:
>> >> >> >> >>
>> >> >> >> >> > Waiting for command timeout...
>> >> >> >> >> > fail: command timeout not detected
>> >> >> >> >>
>> >> >> >> >> Attached is the pgpool.log.
>> >> >> >> >>
>> >> >> >> >> Best regards,
>> >> >> >> >> --
>> >> >> >> >> Tatsuo Ishii
>> >> >> >> >> SRA OSS K.K.
>> >> >> >> >> English: http://www.sraoss.co.jp/index_en/
>> >> >> >> >> Japanese:http://www.sraoss.co.jp
>> >> >> >> >>
>> >> >> >> >> > On Mon, Nov 3, 2025 at 9:05 AM Tatsuo Ishii <
>> >> ishii(at)postgresql(dot)org>
>> >> >> >> >> wrote:
>> >> >> >> >> >
>> >> >> >> >> >> > thanks and sorry for the issues, please find attached
>> updated
>> >> >> >> version.
>> >> >> >> >> >>
>> >> >> >> >> >> No problem.
>> >> >> >> >> >>
>> >> >> >> >> >> This time the patch applies fine, no compiler warnings.
>> >> However,
>> >> >> >> >> >> regression test did not passed here (on Ubuntu 24 LTS if
>> this
>> >> >> >> >> >> matters). So I looked into
>> >> >> >> >> >>
>> >> src/test/regression/tests/041.external_replication_delay/test.sh a
>> >> >> >> >> >> little bit and apply attached patch (test.sh.patch). It
>> moved
>> >> >> forward
>> >> >> >> >> >> partially but failed at:
>> >> >> >> >> >>
>> >> >> >> >> >> fail: command execution failure not detected
>> >> >> >> >> >>
>> >> >> >> >> >> Please find attached
>> >> >> >> >> >>
>> >> >> >> >>
>> >> >> >>
>> >> >>
>> >>
>> src/test/regression/tests/041.external_replication_delay/testdir/pgpool.log
>> >> >> >> >> >> and src/test/regression/log/041.external_replication_delay.
>> >> >> >> >> >>
>> >> >> >> >> >> Best regards,
>> >> >> >> >> >> --
>> >> >> >> >> >> Tatsuo Ishii
>> >> >> >> >> >> SRA OSS K.K.
>> >> >> >> >> >> English: http://www.sraoss.co.jp/index_en/
>> >> >> >> >> >> Japanese:http://www.sraoss.co.jp
>> >> >> >> >> >>
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> > --
>> >> >> >> >> > Nadav Shatz
>> >> >> >> >> > Tailor Brands | CTO
>> >> >> >> >>
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > --
>> >> >> >> > Nadav Shatz
>> >> >> >> > Tailor Brands | CTO
>> >> >> >>
>> >> >> >
>> >> >> >
>> >> >> > --
>> >> >> > Nadav Shatz
>> >> >> > Tailor Brands | CTO
>> >> >>
>> >> >
>> >> >
>> >> > --
>> >> > Nadav Shatz
>> >> > Tailor Brands | CTO
>> >>
>> >
>> >
>> > --
>> > Nadav Shatz
>> > Tailor Brands | CTO
>>
>
>
> --
> Nadav Shatz
> Tailor Brands | CTO
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tatsuo Ishii | 2025-12-25 07:26:00 | Feature: reduce sync messages |
| Previous Message | Nadav Shatz | 2025-12-23 14:03:13 | Re: Proposal: recent access based routing for primary-replica setups |