From: | Jishnu Sygal <jishnu(dot)sygal(at)gmail(dot)com> |
---|---|
To: | pgsql-admin(at)lists(dot)postgresql(dot)org |
Subject: | [Architecture] Multi-Environment pg_cron for Automated Partition Management |
Date: | 2025-09-17 15:04:26 |
Message-ID: | CAE2yxeODNjNZmJw8-FzskhGKwXn7x3H1tvnJonHTQb40yz5TCQ@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin |
Hello Postgres Community,
I am writing to get your expert opinion on a proposed architecture for
managing pg_partman automated partitioning across a multi-environment
PostgreSQL setup. While this solution is primarily for cost savings and is
used in our non-production environments, *stability is a must*. Our core
goal is to create a universal scheduling solution that works identically
across cloud-managed databases (AWS RDS, Google Cloud SQL, Azure Database),
as well as on-premises shared and dedicated hosts.
Business Context & Core Requirements
We need to consistently automate pg_partman maintenance tasks across our
PostgreSQL 17 environments. The primary challenge is maintaining a single,
identical application interface so that the same SQL commands work without
modification, regardless of the deployment model. A key requirement is that
our in-house database schema migration tools will be used to roll out SQL
scripts that directly handle job maintenance and alterations during releases.
Our jobs are low frequency (monthly/quarterly/annually) but are critically
important, as failures can cause serious operational issues.
Our current constraints include:
-
PostgreSQL 17 (with a planned migration to 18).
-
A mix of AWS RDS, Google Cloud SQL, Azure Database, and on-premises
deployments.
-
Shared hosts with 50-60 databases and dedicated hosts with a single
database.
Our Proposed Architecture
As the pg_cron extension requires a dedicated database to schedule jobs for
other databases, we have designed an abstraction layer to address our
multi-environment challenges. Our architecture places pg_cron in a
designated database and uses dblink to execute partition maintenance jobs
in the target application databases.
Key elements of this design are:
-
*Universal API Layer:* Every application database would have an
identical cron schema with wrapper functions.
-
*Identical Application Interface:* Applications use the exact same SQL
statement, for example: SELECT
cron.schedule('monthly_partition_maintenance', '0 2 1 * *', 'SELECT
partman.run_maintenance_proc()');
-
*Environment-Adaptive Communication:* dblink is used for multi-database
environments, while direct calls are used for single-database setups.
-
*PostgreSQL 18 Future-Proofing:* We plan to leverage SCRAM pass-through
authentication for dblink to eliminate the need for storing credentials.
Technical Questions & Concerns
We have detailed several technical questions below and would greatly
appreciate your insights and validation.
*1. Architecture Validation*
-
Is this universal abstraction layer a sound architectural pattern for
managing pg_partman across diverse environments, especially given our
focus on using in-house tools for job management?
-
How should the architecture specifically adapt for AWS RDS vs. Google
Cloud SQL vs. Azure Database, especially concerning their limitations on
pg_cron or connection management?
-
Does this design scale appropriately from a single dedicated database to
a shared host with 50-60 databases and 400-900 partition jobs?
*2. Connection & Performance*
-
Is our proposed temporary-connection-with-guaranteed-cleanup pattern
robust enough for production? We are concerned about dblink connection
exhaustion risks and connection leaks, especially in shared environments
with many jobs.
-
Given the low frequency of jobs, will the dblink overhead be
significant? Are there specific performance optimizations we should
consider for cloud-managed services vs. on-premises?
*3. Alternative Approaches*
-
Should we abandon this PostgreSQL-native approach and instead consider
cloud-native job schedulers (e.g, AWS EventBridge, Google Cloud
Scheduler, Azure Logic Apps) to trigger maintenance jobs?
-
Are there existing enterprise scheduling solutions that are
purpose-built for this kind of multi-cloud, on-premises PostgreSQL
automation?
We are most concerned about connection leaks, unnoticed maintenance
failures, and ensuring a single, identical SQL interface is truly
achievable. We believe this solution could significantly simplify our
operations, but we want to validate its viability and get feedback on
potential pitfalls from the community.
Thank you for your time and any insights you can provide.
Best Regards,
Jishnu Sygal
From | Date | Subject | |
---|---|---|---|
Next Message | bertrand HARTWIG | 2025-09-17 18:26:49 | Re: [Architecture] Multi-Environment pg_cron for Automated Partition Management |
Previous Message | kaido vaikla | 2025-09-17 09:27:33 | Re: Allow connections by IP address? |