From: | Nadav Shatz <nadav(at)tailorbrands(dot)com> |
---|---|
To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
Cc: | pgpool-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Proposal: recent access based routing for primary-replica setups |
Date: | 2025-09-08 09:50:16 |
Message-ID: | CACeKOO23dZSC6okH_YtChEb49YFLuJrwRxC7aVaHzbdX4-fZJA@mail.gmail.com |
Lists: | pgpool-hackers |
Hi Tatsuo,
Please find attached the 3 patch files (implementation, tests, docs) with
the updates we discussed.
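For reference, the ordered-argument contract we discussed can be sketched roughly as follows. This is only an illustration and not the code in the attached patches: the helper names (`fetch_lag_ms`, `lags_in_order`, `format_line`), the stubbed lag table standing in for the CloudWatch lookup, and the space-separated single-line output format are all assumptions made for the example.

```python
"""Illustrative sketch of an order-preserving delay command (hypothetical)."""

# Stand-in for a CloudWatch lookup; the values are the ones from the
# example in this thread, not real measurements.
STUB_LAG_MS = {"db-replica-a": 15, "db-replica-b": 120, "db-replica-c": 60}


def fetch_lag_ms(instance_id):
    """Return the lag in milliseconds for one instance (stubbed, hypothetical)."""
    return STUB_LAG_MS[instance_id]


def lags_in_order(instance_ids):
    """Return lag values in exactly the order the identifiers were passed."""
    return [fetch_lag_ms(instance_id) for instance_id in instance_ids]


def format_line(instance_ids):
    """Assumed output format: one line, space-separated, same order as input."""
    return " ".join(str(lag) for lag in lags_in_order(instance_ids))
```

The point of the sketch is that the caller's argument order is the only source of ordering: the command never sorts or reorders, so the Nth value always belongs to the Nth identifier it was given.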
What do you think?
Best,
On Mon, Sep 8, 2025 at 3:26 AM Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
> Hi Nadav,
>
> > Hi Tatsuo,
> >
> > Thanks for getting back to me. Let me clarify the ordering concern and
> > provide an example to make it clearer:
> >
> > Currently, replication_delay_source_cmd executes without awareness of the
> > replica list or the order in which Pgpool loads them. For Aurora, since
> > we’re bypassing the internal DB tables and fetching lag data directly via
> > the AWS CloudWatch API, we need to ensure the returned lag values are
> > mapped to the correct instances.
> >
> > For example, assume Pgpool has the following configuration:
> >
> > primary: db-primary
> > replicas: db-replica-a, db-replica-b, db-replica-c
> >
> > If the command retrieves lag values [15, 120, 60] from CloudWatch, we need
> > to guarantee these are consistently mapped as:
> >
> > - db-replica-a → 15ms
> > - db-replica-b → 120ms
> > - db-replica-c → 60ms
> >
> > Without explicitly passing the instance identifiers and their order to the
> > command, there’s a risk that mismatched ordering will cause Pgpool to make
> > incorrect routing decisions.
> >
> > To address this, I suggest extending replication_delay_source_cmd to accept
> > an ordered list of instance identifiers as arguments. This way, the command
> > can fetch the metrics in the same sequence Pgpool expects, ensuring
> > alignment between configuration and returned data.
>
> Thanks for the clarification. Previously I misunderstood that Aurora
> only provides a "reader endpoint", which made me think your proposal
> was impossible. But after some research, I found that Aurora also
> provides "instance endpoints", each of which refers to an individual
> instance. So let me check if my understanding is correct.
> replication_delay_source_cmd will be invoked as:
>
> replication_delay_source_cmd db-replica-a db-replica-b db-replica-c
>
> > Would you agree this approach makes sense?
>
> Yes.
>
> > If so, I can provide an updated
> > patch to demonstrate how the command would handle ordered instance mapping.
>
> Thanks. That would be good.
>
> BTW, there are a few minor points regarding your previous patch. In the patch,
>
> 083.external_replication_delay/
>
> is the test directory. This does not fit our test infrastructure
> conventions. Tests for new features should be added
> between 001 and 049; 050 and greater are reserved for tests for bug
> fixes. So at this point, 041 is appropriate (if another test for a new
> feature is added before your patch is committed, you will need to adjust
> the number, of course).
>
> You need to include a patch for documentation. You don't need to write
> Japanese doc (doc.ja). We will create it from the English document
> later on.
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS K.K.
> English: http://www.sraoss.co.jp/index_en/
> Japanese: http://www.sraoss.co.jp
>
--
Nadav Shatz
Tailor Brands | CTO
Attachment | Content-Type | Size |
---|---|---|
0001-feat-add-external-command-replication-delay-source-f.patch | application/octet-stream | 16.1 KB |
0002-test-add-comprehensive-test-suite-for-external-repli.patch | application/octet-stream | 29.8 KB |
0003-doc-document-external-replication-delay-command-and-.patch | application/octet-stream | 4.7 KB |