|From:||Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>|
|Subject:||plpgsql function startup-time improvements|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
Attached are patches for two performance-improvement ideas that came
to me while working on
The three patches are logically independent and could be committed in
any order. But they touch some overlapping code, so as presented,
you need to apply that other patch first and then these two in
The motivation for the first patch is that I noticed that for simple
plpgsql functions, especially triggers, the per-datum palloc()s performed
by copy_plpgsql_datum() during function entry amounted to a significant
fraction of the total runtime. To fix that, the patch simply does one
palloc for the space needed by all datums, relying on a space calculation
performed at the end of function compilation by plpgsql_finish_datums().
This does nothing much for trivial functions with only a few datums, but
for ones with more, it's a worthwhile savings.
BTW, I experimented with a more drastic solution involving separating
the "read only" and "writable" parts of PLpgSQL_datum structs and then
instantiating only the "writable" parts, thus considerably reducing the
amount of data to be copied during per-call initialization. But that
was a lot more invasive to the code, and it seemed to be slower than
what I present here, because performance-critical accesses to variables
had to compute the addresses of both structs associated with the variable.
I've not totally given up hope for that idea, but it'll require more
tuning than I had time for.
In addition to the core idea of the patch, I noticed that there is no
good reason for PLpgSQL_expr to be treated as a kind of PLpgSQL_datum;
those structs are never members of the estate->datums array, nor is
there any code expecting them to be structural supersets of PLpgSQL_datum.
So the patch also removes PLPGSQL_DTYPE_EXPR and the useless fields of
Also, I changed PLpgSQL_var.isconst and PLpgSQL_var.notnull from "int"
to "bool", which is what they should have been all along, and relocated
them in the PLpgSQL_var struct. There are two motivations for this.
It saves a whole 8 bytes per PLpgSQL_var, at least on 64-bit machines,
because the fields now fit into what had been wasted padding space;
reducing the size of what we have to copy during copy_plpgsql_datums
has to be worth something. Second, those fields are now adjacent to
the common PLpgSQL_variable fields, which will simplify migrating them
into PLpgSQL_variable, as I anticipate we'll want to do at some point
when we allow composite variables to be marked CONSTANT and maybe NOT
The idea of the second patch came from noticing that in simple trigger
functions, quite a large fraction of cycles went into setting up the
"built in" variables such as tg_name, even though many trigger functions
probably never read most of those variables. We could improve that by
not computing the values until/unless they're read. There are various
names for this technique, but the one that seemed most evocative to me
was to say that these variables have "promises" attached to them. So
that's what the patch calls them. We mark the variables with an enum
indicating which promise needs to be fulfilled for each one, and then
when about to read a datum, we fulfill the promise if needed.
The method I settled on for that was to invent a separate DTYPE_PROMISE,
which otherwise is treated exactly like DTYPE_VAR, and to code places
like exec_eval_datum() like this:
+ case PLPGSQL_DTYPE_PROMISE:
+ /* fulfill promise if needed, then handle like regular var */
+ plpgsql_fulfill_promise(estate, (PLpgSQL_var *) datum);
+ /* FALL THRU */
PLpgSQL_var *var = (PLpgSQL_var *) datum;
The extra DTYPE is a little bit grotty, but it's not awful. One
alternative I experimented with was to just treat these variables
as plain DTYPE_VAR, requiring coding like
PLpgSQL_var *var = (PLpgSQL_var *) datum;
+ if (unlikely(var->promise != PLPGSQL_PROMISE_NONE))
+ plpgsql_fulfill_promise(estate, var);
*typeid = var->datatype->typoid;
*typetypmod = var->datatype->atttypmod;
*value = var->value;
However, this way is injecting an additional test-and-branch into
hot code paths, and it was demonstrably slower.
With these patches, I see performance improvements of 10% to 20%
on simple but not totally unrealistic triggers, for example
create or replace function mytrig() returns trigger language plpgsql as
if (new.f1 != new.f2) or (new.f3 != new.f4) then
new.f3 = 42;
(BTW, those are percentages of total INSERT runtime, not just of
the trigger proper; though I cheated to the extent of using a
temp not regular table.)
It seems possible that the "promise" technique could be useful for
other plpgsql special variables in future. I thought briefly about
applying it to triggers' NEW and OLD arguments, but desisted because
(a) it's only a win if triggers will commonly not touch the variable,
which seems unlikely to be true for NEW/OLD; and (b) it would have
required infrastructure for attaching a promise to a DTYPE_REC
variable, which was more pain than I wanted. But I wonder if it'd
be useful for, say, the special variables that exception blocks create.
I'll add this to the January commitfest.
regards, tom lane
|Next Message||Alvaro Herrera||2017-12-27 21:31:12||Re: pgsql: Get rid of copy_partition_key|
|Previous Message||Antonio Belloni||2017-12-27 20:18:27||Contributing with code|