Quick Links

plpgsql versus domains

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	pgsql-hackers(at)postgreSQL(dot)org
Subject:	plpgsql versus domains
Date:	2015-02-26 17:31:46
Message-ID:	12503.1424971906@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

At the behest of Salesforce, I've been looking into improving plpgsql's
handling of variables of domain types, which is currently pretty awful.
It's really slow, because any assignment to a domain variable goes through
the slow double-I/O-conversion path in exec_cast_value, even if no actual
work is needed. And I noticed that the domain's constraints get looked up
only once per function per session; for example this script gets
unexpected results:

create domain di as int;

create function foo1(int) returns di as $$
declare d di;
begin
d := $1;
return d;
end
$$ language plpgsql immutable;

select foo1(0); -- call once to set up cache

alter domain di add constraint pos check (value > 0);

select 0::di; -- fails, as expected

select foo1(0); -- should fail, does not

\c -

select foo1(0); -- now it fails

The reason for this is that domain_in looks up the domain's constraints
and caches them under the calling FmgrInfo struct. That behavior was
designed to ensure we'd do just one constraint-lookup per query, which
I think is reasonable. But plpgsql sets up an FmgrInfo in the variable's
PLpgSQL_type record, which persists for the whole session unless the
function's parse tree is flushed for some reason. So the constraints
only get looked up once.

The rough sketch I have in mind for fixing this is:

1. Arrange to cache the constraints for domain types in typcache.c,
so as to reduce the number of trips to the actual catalogs. I think
typcache.c will have to flush these cache items upon seeing any sinval
change in pg_constraint, but that's still cheaper than searching
pg_constraint for every query.

2. In plpgsql, explicitly model a domain type as being its base type
plus some constraints --- that is, stop depending on domain_in() to
enforce the constraints, and do it ourselves. This would mean that
the FmgrInfo in a PLpgSQL_type struct would reference the base type's
input function rather than domain_in, so we'd not have to be afraid
to cache that. The constraints would be represented as a separate list,
which we'd arrange to fetch from the typcache once per transaction.
(I think checking for new constraints once per transaction is sufficient
for reasonable situations, though I suppose that could be argued about.)
The advantage of this approach is first that we could avoid an I/O
conversion when the incoming value to be assigned matches the domain's
base type, which is the typical case; and second that a domain without
any CHECK constraints would become essentially free, which is also a
common case.

3. We could still have domains.c responsible for most of the details;
the domain_check() API may be good enough as-is, though it seems to lack
any simple way to force re-lookup of the domain constraints once per xact.

Thoughts, better ideas?

Given the lack of field complaints, I don't feel a need to try to
back-patch a fix for this, but I do want to see it fixed for 9.5.

regards, tom lane

Responses

Re: plpgsql versus domains at 2015-02-26 18:01:57 from Pavel Stehule
Re: plpgsql versus domains at 2015-02-26 18:44:10 from Andres Freund

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	2015-02-26 17:34:24	Re: Performance improvement for joins where outer side is unique
Previous Message	David Steele	2015-02-26 17:31:36	Re: pgaudit - an auditing extension for PostgreSQL