Re: WIP Patch: Precalculate stable functions, infrastructure v1

From: Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Arthur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>, Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: WIP Patch: Precalculate stable functions, infrastructure v1
Date: 2018-02-01 05:01:48
Message-ID: b9a484a81a59b724edb90c16288843cb@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, hackers!

This is the 8-th version of the patch for the precalculation of stable
or immutable functions, stable or immutable operators and other
nonvolatile expressions. This is a try to fix the most problems (I'm
sorry, it took some time..) that Tom Lane and Andres Freund mentioned in
[1], [2] and [3]. It is based on the top of master, and on my computer
make check-world passes. And I'll continue work on it.

Any suggestions are welcome!

[1]
https://www.postgresql.org/message-id/27837.1516138246%40sss.pgh.pa.us
[2]
https://www.postgresql.org/message-id/29643.1516140301%40sss.pgh.pa.us
[3]
https://www.postgresql.org/message-id/20180124203616.3gx4vm45hpoijpw3%40alap3.anarazel.de

> On 17-01-2018 0:30, Tom Lane wrote:
> ...
> This is indeed quite a large patch, but it seems to me it could become
> smaller. After a bit of review:
>
> 1. I do not like what you've done with ParamListInfo. The changes
> around
> that are invasive, accounting for a noticeable part of the patch bulk,
> and I don't think they're well designed. Having to cast back and forth
> between ParamListInfo and ParamListInfoCommon and so on is ugly and
> prone
> to cause (or hide) errors. And I don't really understand why
> ParamListInfoPrecalculationData exists at all. Couldn't you have gotten
> the same result with far fewer notational changes by defining another
> PARAM_FLAG bit in the existing pflags field? (Or alternatively, maybe
> the real need here is for another ParamKind value for Param nodes?)
>
> I also dislike this approach because it effectively throws away the
> support for "virtual" param arrays that I added in commit 6719b238e:
> ParamListInfoPrecalculationData has no support for dynamically
> determined
> parameter properties, which is surely something that somebody will
> need.
> (It's just luck that the patch doesn't break plpgsql today.) I realize
> that that's a recent commit and the code I'm complaining about predates
> it, but we need to adjust this so that it fits in with the new
> approach.
> See comment block at lines 25ff in params.h.

Changed, now only the new flag PARAM_FLAG_PRECALCULATED

> 2. I don't follow the need for the also-rather-invasive changes to
> domain
> constraint data structures. I do see that the patch attempts to make
> CoerceToDomain nodes cacheable, which is flat wrong and has to be
> ripped
> out. You *cannot* assume that the planner has access to the same
> domain
> constraints that will apply at runtime.

Removed

> 4. I don't like the way that you've inserted
> "replace_qual_cached_expressions" and
> "replace_pathtarget_cached_expressions" calls into seemingly random
> places
> in the planner. Why isn't that being done uniformly during expression
> preprocessing? There's no apparent structure to where you've put these
> calls, and so they seem really vulnerable to errors of omission. Also,
> if this were done in expression preprocessing, there'd be a chance of
> combining it with some existing pass over expression trees instead of
> having to do a separate (and expensive) expression tree mutation.
> I can't help suspecting that eval_const_expressions could take this on
> as an additional responsibility with a lot less than a thousand new
> lines
> of code.

eval_const_expressions is changed accordingly and, thank you, now
there're fewer omissions)

> 5. BTW, cost_eval_cacheable_expr seems like useless restructuring as
> well.
> Why aren't you just recursively applying the regular costing function?

Fixed

> And what in the world is
> CheckBoundParams about? The internal documentation in this patch
> isn't quite nonexistent, but it's well short of being in a
> committable state IMO.

This is a try to improve it..

> 3. I think you should also try hard to get rid of the need for
> PlannedStmt.hasCachedExpr. AFAICS there's only one place that is
> using that flag, which is exec_simple_check_plan, and I have to
> think there are better ways we could deal with that. In particular,
> I don't understand why you haven't simply set up plpgsql parameter
> references to be noncacheable. Or maybe what we'd better do is
> disable CacheExpr insertions into potentially-simple plans in the
> first place. As you have it here, it's possible for recompilation
> of an expression to result in a change in whether it should be deemed
> simple or not, which will break things (cf commit 00418c612).

<...>

> Another thing that's bothering me is that the execution semantics
> you're proposing for CachedExpr seem rather inflexible. AFAICS, once a
> CachedExpr has run once, it will hang on to the result value and keep
> returning that for the entire lifespan of the compiled expression.
> We already noted that that breaks plpgsql's "simple expression"
> logic, and it seems inevitable to me that it will be an issue for
> other places as well. I think it'd be a better design if we had some
> provision for resetting the cached values, short of recompiling the
> expression from scratch.
>
> One way that occurs to me to do this is to replace the simple boolean
> isExecuted flags with a generation counter, and add a master generation
> counter to ExprState. The rule for executing CachedExpr would be "if
> my
> generation counter is different from the ExprState's counter, then
> evaluate the subexpression and copy the ExprState's counter into mine".
> Then the procedure for forcing recalculation of cached values is just
> to
> increment the ExprState's counter. There are other ways one could
> imagine
> doing this --- for instance, I initially thought of keeping the master
> counter in the ExprContext being used to run the expression. But you
> need
> some way to remember what counter value was used last with a particular
> expression, so probably keeping it in ExprState is better.

I did something like that..

>> Keeping the stored value of a CachedExpr in a Param slot is an
>> interesting idea indeed.
>
> We keep coming back to this, IIRC we had a pretty similar discussion
> around redesigning caseValue_datum/isNull domainValue_datum/isNull to
> be
> less ugly. There also was
> https://www.postgresql.org/message-id/20171116182208.kcvf75nfaldv36uh@alap3.anarazel.de
> where we discussed using something similar to PARAM_EXEC Param nodes to
> allow inlining of volatile functions.
>
> ISTM, there might be some value to consider all of them in the design
> of
> the new mechanism.

I'm sorry, the other parts have occupied all the time, and I'll work on
it..

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
v8-0001-Precalculate-stable-and-immutable-functions.patch text/x-diff 405.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-02-01 05:03:48 Re: [HACKERS] generated columns
Previous Message Amit Kapila 2018-02-01 04:48:00 Re: [HACKERS] why not parallel seq scan for slow functions