Re: [PATCH] Add pg_get_table_ddl() to reconstruct CREATE TABLE statements

From: Rui Zhao <zhaorui126(at)gmail(dot)com>
To: Akshay Joshi <akshay(dot)joshi(at)enterprisedb(dot)com>
Cc: Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PATCH] Add pg_get_table_ddl() to reconstruct CREATE TABLE statements
Date: 2026-07-04 16:35:44
Message-ID: CAHWVJhFh7R6uTMg_TiCPc4zZRYV_ak=T9gEsTmVspxYxiydXcA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Akshay,

Re-tested v16 on current master -- builds clean and the recent fixes hold up
(self-ref FK after PK, no duplicate partition-child index, cross-schema
partition child qualified, schema_qualified => true consistent). A few
issues, in severity order; several are places where pg_dump already does the
right thing.

1. Clause order: TABLESPACE is emitted before ON COMMIT (ddlutils.c:2042 vs
2054), but the grammar is "... OptWith OnCommitOption OptTableSpace" -- ON
COMMIT must come first, so a temp table with both clauses is non-replayable:

CREATE TABLESPACE ts1 LOCATION '/path/to/dir';
CREATE TEMP TABLE tt (a int) ON COMMIT DROP TABLESPACE ts1;
SELECT d FROM pg_get_table_ddl('tt'::regclass, owner => false) d;
-- CREATE TEMPORARY TABLE tt (a integer) TABLESPACE ts1 ON COMMIT DROP;

CREATE TEMPORARY TABLE tt (a integer) TABLESPACE ts1 ON COMMIT DROP;
-- ERROR: syntax error at or near "ON"

Swapping the two blocks so ON COMMIT precedes TABLESPACE fixes it.

2. Child-default override recurses (missing ONLY). emit_child_default_overrides
emits the inherited-column default without ONLY (ddlutils.c:2118), so SET
DEFAULT recurses into the table's children and reconstructing one table
silently rewrites another:

CREATE TABLE dpar (x int);
CREATE TABLE dch () INHERITS (dpar);
CREATE TABLE dgc () INHERITS (dch);
ALTER TABLE ONLY dch ALTER COLUMN x SET DEFAULT 5;
ALTER TABLE ONLY dgc ALTER COLUMN x SET DEFAULT 10;

SELECT d FROM pg_get_table_ddl('dch'::regclass, owner => false) d;
-- ALTER TABLE dch ALTER COLUMN x SET DEFAULT 5; -- no ONLY

ALTER TABLE dch ALTER COLUMN x SET DEFAULT 5; -- replay this line
-- => dgc's default is now 5, not 10

pg_dump uses ALTER TABLE ONLY here for exactly this reason.

Partitioned tables hit this especially easily -- any partitioned table with a
column default emits a redundant, ONLY-less SET DEFAULT for every partition
that merely inherits it:

CREATE TABLE p (id int, amt int DEFAULT 5) PARTITION BY LIST (id);
CREATE TABLE p_a PARTITION OF p FOR VALUES IN (1);

SELECT d FROM pg_get_table_ddl('p'::regclass, owner => false) d;
-- CREATE TABLE public.p (id integer, amt integer DEFAULT 5)
PARTITION BY LIST (id);
-- CREATE TABLE public.p_a PARTITION OF public.p FOR VALUES IN (1);
-- ALTER TABLE public.p_a ALTER COLUMN amt SET DEFAULT 5; --
redundant, no ONLY

p_a already inherits amt's default from the PARTITION OF, so the third line is
redundant; pg_dump instead keeps the default inline on the child and attaches
with ALTER TABLE ONLY ... ATTACH PARTITION. A full-hierarchy replay
self-corrects (each child's own SET DEFAULT runs last), but a partial replay or
a lone emitted statement does not. The commit message lists "child-local
DEFAULT overrides on inheritance/partition children" as supported, so this is
in scope. (More generally the patch never emits ONLY anywhere; ADD CONSTRAINT,
where CHECK / NOT NULL also recurse to children, is worth the same audit.)

3. Inherited-only NOT NULL emitted as local. When a child redeclares an
inherited column (attislocal) without restating NOT NULL, the constraint is
inherited-only (conislocal = false); collect_local_not_null skips it, but
append_column_defs keys off att->attnotnull and emits a bare NOT NULL anyway:

CREATE TABLE par (a int NOT NULL);
CREATE TABLE chld (a int) INHERITS (par); -- 'a' redeclared, no NOT NULL

SELECT d FROM pg_get_table_ddl('chld'::regclass, owner => false) d;
-- CREATE TABLE public.chld (a integer NOT NULL) INHERITS (public.par);

On replay the child now owns the constraint (conislocal flips false -> true,
name regenerates par_a_not_null -> chld_a_not_null), so a later
"ALTER TABLE par ALTER a DROP NOT NULL" cascades to the original child but not
the reconstructed one. pg_dump emits "a integer" with no NOT NULL here,
suppressing it via notnull_islocal (pg_dump.c ~9916). The docs describe this as
the intended behavior -- "Inherited columns and constraints ... are not
duplicated on inheritance children or partitions" -- so it reads as a
documented contract the code doesn't quite meet. The inline CHECK path in this
patch already filters on conislocal; the NOT NULL path could do the same.

4. Typed-table STORAGE / COMPRESSION not emitted -- intended? append_column_defs
emits per-column STORAGE for ordinary tables, but the typed-table path
(append_typed_column_overrides) only handles DEFAULT / NOT NULL / CHECK, so a
storage override on a typed table is not reproduced:

CREATE TYPE mytype AS (a int, b text);
CREATE TABLE typed_t OF mytype;
ALTER TABLE typed_t ALTER COLUMN b SET STORAGE external;

SELECT d FROM pg_get_table_ddl('typed_t'::regclass, owner => false) d;
-- CREATE TABLE public.typed_t OF public.mytype; -- STORAGE
not emitted

The docs scope the typed-table form to overrides for "defaults, NOT NULL, and
CHECK", so this may well be deliberate. But STORAGE is listed in the general
per-column coverage, and pg_dump does emit it (ALTER TABLE ONLY ... ALTER COLUMN
b SET STORAGE EXTERNAL) -- so it seems worth confirming the omission is
intentional rather than an oversight.

5. Minor: is_auto (ddlutils.c:1381) and the identity SEQUENCE NAME check (1681)
rebuild the expected auto-name with snprintf, but the backend uses
makeObjectName(), which truncates name1/name2 to fit NAMEDATALEN and never the
label -- so for long names the "_not_null" / "_seq" suffix is dropped and the
checks misfire, emitting a name a short-named table would omit:

CREATE TABLE aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
(bbbbbbbbbbbbbbbbbbbb int GENERATED ALWAYS AS IDENTITY);

SELECT d FROM pg_get_table_ddl(
'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'::regclass,
owner => false) d;
-- CREATE TABLE public.aaaa...(50) (bbbb...(20) integer
-- GENERATED ALWAYS AS IDENTITY (SEQUENCE NAME public.aaaa..._seq)
-- CONSTRAINT aaaa..._not_null NOT NULL);

The default-omission convention in the commit message lists "the auto-generated
identity sequence name" among the clauses meant to be dropped, which is exactly
what misfires here for long names. It still replays (the names are real), so
this one is cosmetic; comparing against makeObjectName(relname, colname,
"not_null" / "seq") makes both sides agree.

Everything else looks good.

Thanks,
Rui

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Henson Choi 2026-07-04 16:39:43 Re: Row pattern recognition
Previous Message Tom Lane 2026-07-04 15:51:10 Re: PROPERTY GRAPH pg_dump ACL minimization