RFC: A constraint-enforcement layer, decoupling cross-partition constraints from indexing

From: "Greg Burd" <greg(at)burd(dot)me>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RFC: A constraint-enforcement layer, decoupling cross-partition constraints from indexing
Date: 2026-06-09 19:46:19
Message-ID: f79b20eb-a0eb-4b36-9379-54dc376e144d@app.fastmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello hackers,

I'd like to float an idea and gauge appetite for exploring it. This is a request for comments, not a patch.

The problem, stated narrowly:

A UNIQUE or PRIMARY KEY constraint on a partitioned table must today include every partition-key column, because each per-partition index can only police uniqueness inside its own partition. The most-wanted feature this blocks is a cross-partition unique constraint on a non-partition-key column (e.g. partition by created_at, but keep email globally unique).

Prior art (and what each pays for it)
* The 2019 "Proposal: Global Index" [1] - a single parent-level index over storage-less parents.
* The 2022 HighGo "Global Unique Index" POC [2][3][4] - RELKIND_GLOBAL_INDEX, a merge-sort across all partitions at build, and on every write a probe into other partitions' indexes.
* Dilip Kumar's ongoing "Global Index" patch set [5] - a partition identifier embedded in each index tuple, a pg_index_partitions catalog, planner paths for the partitioned relation, and (currently) locking the partition hierarchy during DML.
* Postgres Pro's shipped gbtree AM [6][7] - a non-MVCC global structure carrying the PK as INCLUDE-style columns, with hash-bucket locking for uniqueness; reportedly ~1.6–1.7× slower writes with 100 partitions.

These are all full global indexes: they have to serve index scans, carry per-row entries with visibility semantics, expose planner paths, and be vacuumed. That is a large surface, and I think it's why the feature has been "almost there" for the better part of a decade.

The observation:

In the recent thread [8][9], the discussion already split the problem in two: for SELECTs we'd like to avoid locking every partition, and for uniqueness the conflict is found by probing one structure, after which only the single owning partition needs touching. In other words, enforcing a cross-partition constraint is a strictly smaller problem than maintaining a cross-partition index for reads. A unique constraint needs only: "does this key already exist anywhere, and if so where, transactionally?" It does not need ordered range scans, planner paths, or visibility-aware read access for arbitrary queries.

The proposal: a Constraint Management layer (two-tier)

Formalize constraint enforcement as its own abstraction, separate from indexing - call it a constraint method (CM), analogous to but distinct from the index AM. The planner/executor consult the CM before the row reaches the table AM and index AMs, or in cases where we are using the index for constraint enforcement as we do now.

Tier 1 - reuse an index when one already does the job. When the constraint's columns include the partition key (today's supported case) the CM simply delegates to existing local-index.

Tier 2 - a purpose-built enforcer when full indexing is overkill. For cross-partition uniqueness on a non-key column, register a lightweight enforcement structure that maintains just enough logged state to answer the membership question and locate the one conflicting partition. It is keyed only by the constraint columns; it stores no per-row scan payload, exposes no planner path, and is checked/updated from an ExecInsert/Update/Delete hook ahead of the heap and index inserts. Conceptually this is a global membership map (key -> owning-partition + heap TID) rather than a global secondary index.

The win is that Tier 2 lets us defer the hard parts the full-index proposals struggle with. Because the enforcer is consulted at the executor level (not from inside index AM code), we know which single partition to lock at exactly the moment we have a candidate conflict — addressing the "we only discover the partition while inside the AM" objection raised in the current thread. And because it carries no MVCC scan data, it sidesteps much of the vacuum/bloat conversation.

What it might look like in code:

A pg_constraint_method catalog and a CM descriptor with a handful of callbacks, roughly:
cm_check(values, snapshot) -> {ok | conflict(partition, tid)}
cm_insert(values, partition, tid, xid) / cm_delete(...)
cm_build(partitioned_rel) for initial population
cm_vacuum(...) / WAL redo for the enforcer's own state.

A reference Tier-2 enforcer: a WAL-logged, MVCC-aware ordered/hash relation keyed on the constraint columns, holding (partition_id, heap_tid, xmin/xmax) — large enough to enforce and to resolve the visibility of a would-be conflict, small enough to avoid being a second copy of the data.

Executor wiring: ExecInsert/ExecUpdate call cm_check after forming the tuple but before/around the heap insert (mirroring how speculative insertion + ON CONFLICT already work), locking only the partition returned by a positive cm_check.

Recovery: the enforcer is just another WAL-logged relation/fork, so crash recovery and physical replication come along for free; logical replication and ON CONFLICT are explicit follow-ups.

Why I think this is worth exploring over "just finish the global index":

The full-global-index patches are valuable and I'm not proposing to abandon them - Tier 1 can adopt them when they land, but the motivation for global indexes should be a requirement for index scans across partitioned tables, not constraint management. Based on what I understand, the real demand is for the constraint enforcement over partitioned tables, not the index scan. A CM layer lets us ship cross-partition uniqueness (and cross-partition exclusion constraints, partition-spanning FKs) sooner, with a smaller and more reviewable surface providing a "global" constraints system which should enable a wider adoption of partitioned tables.

Open questions I'd like input on:

- Is a separate constraint-method abstraction warranted, or should this be modeled as a degenerate "constraint-only" mode of the proposed global index (no read paths)?

- Concurrency: is a speculative-insertion-style protocol against the Tier-2 structure sufficient, or do we need a dedicated predicate-locking scheme for SSI correctness?

- Catalog and DDL: how should ALTER TABLE … ADD CONSTRAINT choose Tier 1 vs Tier 2, and should users be able to force it?

- Does deferring partition locks to executor-level CM checks interact badly with prepared plans / AcquireExecutorLocks?

- Is there appetite for a minimal prototype (uniqueness only, no FK/exclusion/etc.) as a discussion vehicle/proof-of-concept?

I'm happy to put together a WIP prototype of the uniqueness enforcer if there's interest in the direction. Feedback, especially on whether this should be folded into the existing global-index effort rather than standing alongside it, very welcome.

best,

-greg

[1] https://www.postgresql.org/message-id/CALtqXTcurqy1PKXzP9XO%3DofLLA5wBSo77BnUnYVEZpmcA3V0ag%40mail.gmail.com
[2] https://www.postgresql.org/message-id/184879c5306.12490ea581628934.7312528450011769010%40highgo.ca
[3] https://www.highgo.ca/2022/10/28/cross-partition-uniqueness-guarantee-with-global-unique-index/
[4] https://www.highgo.ca/2022/12/16/global-unique-index-attach-support-and-its-potential-deficiency/
[5] https://www.postgresql.org/message-id/CA%2BHiwqG%2BEizej9nCNcxSOfFyZ2i9Mhv%3Dzn%2Ba%2B4o-gwsvFz6EqQ%40mail.gmail.com
[6] https://postgrespro.com/docs/enterprise/current/pgpro-gbtree
[7] https://habr.com/en/companies/postgrespro/articles/948428/
[8] https://www.postgresql.org/message-id/CA%2BHiwqG%2BEizej9nCNcxSOfFyZ2i9Mhv%3Dzn%2Ba%2B4o-gwsvFz6EqQ%40mail.gmail.com
[9] https://www.postgresql.org/message-id/CAFiTN-tu1f0TL4C1CgRzBYkTrrhcYscc7Nz_LJ3xJDOZJGA6kA%40mail.gmail.com

Browse pgsql-hackers by date

  From Date Subject
Next Message Corey Huinker 2026-06-09 20:00:49 Re: postgres_fdw: Emit message when batch_size is reduced
Previous Message Ayush Tiwari 2026-06-09 19:33:45 Re: [Bug] Add the missing RTE_GRAPH_TABLE case to transformLockingClause()