Re: cataloguing NOT NULL constraints

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: cataloguing NOT NULL constraints
Date: 2023-08-31 10:02:39
Message-ID: 20230831100239.gzjmiovzabx2diuo@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2023-Mar-29, Peter Eisentraut wrote:

> On 27.03.23 15:55, Peter Eisentraut wrote:
> > The information schema should be updated.  I think the following views:
> >
> > - CHECK_CONSTRAINTS
> > - CONSTRAINT_COLUMN_USAGE
> > - DOMAIN_CONSTRAINTS
> > - TABLE_CONSTRAINTS
> >
> > It looks like these have no test coverage; maybe that could be addressed
> > at the same time.
>
> Here are patches for this. I haven't included the expected files for the
> tests; this should be checked again that output is correct or the changes
> introduced by this patch set are as expected.
>
> The reason we didn't have tests for this before was probably in part because
> the information schema made up names for not-null constraints involving
> OIDs, so the test wouldn't have been stable.
>
> Feel free to integrate this, or we can add it on afterwards.

I'm eyeing patch 0002 here. I noticed that in view check_constraints it
defines the not-null constraint definition as substrings over the
pg_get_constraintdef() function[q1], so I wondered whether it might be
better to join to pg_attribute instead. I see two options:

1. add a scalar subselect in the select list for each constraint [q2]
2. add a LEFT JOIN to pg_attribute to the main FROM list [q3]
ON con.conrelid=att.attrelid AND con.conkey[1] = con.attrelid

With just the regression test tables in place, these forms are all
pretty much the same in execution time. I then created 20k tables with
6 columns each and NOT NULL constraint on each column[4]. That's not a
huge amount but it's credible for a medium-size database with a bunch of
partitions (it's amazing what passes for a medium-size database these
days). I was surprised to find out that q3 (~130ms) is three times
faster than q2 (~390ms), which is in turn more than twice faster than
your proposed q1 (~870ms). So unless you have another reason to prefer
it, I think we should use q3 here.

In constraint_column_usage, you're adding a relkind to the catalog scan
that goes through pg_depend for CHECK constraints. Here we can rely on
a simple conkey[1] check and a separate UNION ALL arm[q5]; this is also
faster when there are many tables.

The third view definition looks ok. It's certainly very nice to be able
to delete XXX comments there.

[q1]
SELECT current_database()::information_schema.sql_identifier AS constraint_catalog,
rs.nspname::information_schema.sql_identifier AS constraint_schema,
con.conname::information_schema.sql_identifier AS constraint_name,
CASE con.contype
WHEN 'c'::"char" THEN "left"(SUBSTRING(pg_get_constraintdef(con.oid) FROM 8), '-1'::integer)
WHEN 'n'::"char" THEN SUBSTRING(pg_get_constraintdef(con.oid) FROM 10) || ' IS NOT NULL'::text
ELSE NULL::text
END::information_schema.character_data AS check_clause
FROM pg_constraint con
LEFT JOIN pg_namespace rs ON rs.oid = con.connamespace
LEFT JOIN pg_class c ON c.oid = con.conrelid
LEFT JOIN pg_type t ON t.oid = con.contypid
WHERE pg_has_role(COALESCE(c.relowner, t.typowner), 'USAGE'::text) AND (con.contype = ANY (ARRAY['c'::"char", 'n'::"char"]));

[q2]
SELECT current_database()::information_schema.sql_identifier AS constraint_catalog,
rs.nspname::information_schema.sql_identifier AS constraint_schema,
con.conname::information_schema.sql_identifier AS constraint_name,
CASE con.contype
WHEN 'c'::"char" THEN "left"(SUBSTRING(pg_get_constraintdef(con.oid) FROM 8), '-1'::integer)
WHEN 'n'::"char" THEN FORMAT('CHECK (%s IS NOT NULL)',
(SELECT attname FROM pg_attribute WHERE attrelid = conrelid AND attnum = conkey[1]))
ELSE NULL::text
END::information_schema.character_data AS check_clause
FROM pg_constraint con
LEFT JOIN pg_namespace rs ON rs.oid = con.connamespace
LEFT JOIN pg_class c ON c.oid = con.conrelid
LEFT JOIN pg_type t ON t.oid = con.contypid
WHERE pg_has_role(COALESCE(c.relowner, t.typowner), 'USAGE'::text) AND (con.contype = ANY (ARRAY['c'::"char", 'n'::"char"]));

[q3]
SELECT current_database()::information_schema.sql_identifier AS constraint_catalog,
rs.nspname::information_schema.sql_identifier AS constraint_schema,
con.conname::information_schema.sql_identifier AS constraint_name,
CASE con.contype
WHEN 'c'::"char" THEN "left"(SUBSTRING(pg_get_constraintdef(con.oid) FROM 8), '-1'::integer)
WHEN 'n'::"char" THEN FORMAT('CHECK (%s IS NOT NULL)', at.attname)
ELSE NULL::text
END::information_schema.character_data AS check_clause
FROM pg_constraint con
LEFT JOIN pg_namespace rs ON rs.oid = con.connamespace
LEFT JOIN pg_class c ON c.oid = con.conrelid
LEFT JOIN pg_type t ON t.oid = con.contypid
LEFT JOIN pg_attribute at ON (con.conrelid = at.attrelid AND con.conkey[1] = at.attnum)
WHERE pg_has_role(COALESCE(c.relowner, t.typowner), 'USAGE'::text) AND (con.contype = ANY (ARRAY['c'::"char", 'n'::"char"]));

[4]
do $$ begin for i in 0 .. 20000 loop
execute format('create table t_%s (a1 int not null, a2 int not null, a3 int not null,
a4 int not null, a5 int not null, a6 int not null);',
i);
if i % 1000 = 0 then commit; end if;
end loop; end $$;

[q5]
SELECT CAST(current_database() AS sql_identifier) AS table_catalog,
CAST(tblschema AS sql_identifier) AS table_schema,
CAST(tblname AS sql_identifier) AS table_name,
CAST(colname AS sql_identifier) AS column_name,
CAST(current_database() AS sql_identifier) AS constraint_catalog,
CAST(cstrschema AS sql_identifier) AS constraint_schema,
CAST(cstrname AS sql_identifier) AS constraint_name

FROM (
/* check constraints */
SELECT DISTINCT nr.nspname, r.relname, r.relowner, a.attname, nc.nspname, c.conname
FROM pg_namespace nr, pg_class r, pg_attribute a, pg_depend d, pg_namespace nc, pg_constraint c
WHERE nr.oid = r.relnamespace
AND r.oid = a.attrelid
AND d.refclassid = 'pg_catalog.pg_class'::regclass
AND d.refobjid = r.oid
AND d.refobjsubid = a.attnum
AND d.classid = 'pg_catalog.pg_constraint'::regclass
AND d.objid = c.oid
AND c.connamespace = nc.oid
AND c.contype = 'c'
AND r.relkind IN ('r', 'p')
AND NOT a.attisdropped

UNION ALL

/* not-null constraints */
SELECT DISTINCT nr.nspname, r.relname, r.relowner, a.attname, nc.nspname, c.conname
FROM pg_namespace nr, pg_class r, pg_attribute a, pg_namespace nc, pg_constraint c
WHERE nr.oid = r.relnamespace
AND r.oid = a.attrelid
AND r.oid = c.conrelid
AND a.attnum = c.conkey[1]
AND c.connamespace = nc.oid
AND c.contype = 'n'
AND r.relkind in ('r', 'p')
AND not a.attisdropped

UNION ALL

/* unique/primary key/foreign key constraints */
SELECT nr.nspname, r.relname, r.relowner, a.attname, nc.nspname, c.conname
FROM pg_namespace nr, pg_class r, pg_attribute a, pg_namespace nc,
pg_constraint c
WHERE nr.oid = r.relnamespace
AND r.oid = a.attrelid
AND nc.oid = c.connamespace
AND r.oid = CASE c.contype WHEN 'f' THEN c.confrelid ELSE c.conrelid END
AND a.attnum = ANY (CASE c.contype WHEN 'f' THEN c.confkey ELSE c.conkey END)
AND NOT a.attisdropped
AND c.contype IN ('p', 'u', 'f')
AND r.relkind IN ('r', 'p')

) AS x (tblschema, tblname, tblowner, colname, cstrschema, cstrname)

WHERE pg_has_role(x.tblowner, 'USAGE') ;

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2023-08-31 10:12:42 Re: Sync scan & regression tests
Previous Message Alexander Lakhin 2023-08-31 10:00:00 Re: cataloguing NOT NULL constraints