Re: extend pgbench expressions with functions

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: extend pgbench expressions with functions
Date: 2015-11-04 09:06:30
Message-ID: alpine.DEB.2.10.1511040930040.16208@sto
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello Robert,

> 1. I think there should really be two patches here, the first adding
> functions, and the second adding doubles. Those seem like separate
> changes. And offhand, the double stuff looks a lot less useful that
> the function call syntax.

I first submitted the infrastructure part, but I was asked to show how
more functions could be included, especially random variants. As random
gaussian/exponential functions require a double argument, there must be
some support for double values.

Now, it could certainly be split in two patches, but this is rather
artificial, IMO.

> 2. ddebug and idebug seem like a lousy idea to me.

It was really useful to me for debugging and testing.

> If we really need to be able to print stuff from pgbench, which I kind
> of doubt, then maybe we should have a string data type and a print()
> function, or maybe a string data type and a toplevel \echo command.

The *debug functions allow to intercept the value computed within an
expression. If you rely on variables and some echo (which does not exist)
this means that there should be double variables as well, which is not
currently the case, and which I do not see as useful for the kind of
script written for pgbench. Adding the string type is more work, and
I do not see a good use case for those.

So the *debug functions are really just a lightweight solution for
debugging type related issues in expressions. I can drop them if this is a
blocker, but the are really useful for testing quickly a script.

> 3. I'm perplexed by why you've replaced evaluateExpr() with evalInt()
> and evalDouble().

As explained above in the thread (I think), the reason is that having one
overloaded expression evaluation which handles types conversion would
produce pretty heavy code, and the two functions with the descending
typing allows to have a much smaller code with the same effect.

The issue is that with two types all functions must handle argument type
conversion explicitely.

For instance for "random_gaussian(int, int, double)", it may be called
with any combination of 3 int/double arguments, each one must be tested
and possibly converted to the target type before calling the actual
function. For overloaded operators or functions (arithmetics, abs...)
there is also the decision about which operator is called and then what
conversions are necessary.

With the descending typing and two functions cross recursion all these
explicit tests and conversion disappear because the function evaluation
calls evalInt or evalDouble depending on the expected types.

Basically, the code is significantly shorter and elegant with this option.

> That doesn't seem similar to what I've seen in other
> expression-evaluation engines.

Probably. This is because I choose a descending typing to simplify the
implementation. Changing this would bring no real practical benefit from
the usage point of view, but would add significant more verbose and ugly
code to test and handle type conversions everywhere, so I'm not keen to do

> Perhaps I could find out by reading the comments, but actually not,
> because this entire patch seems to add only one comment:
> + /* reset column count for this scan */

There are a few others, really:-)

> While I'm not a fan of excessive commenting, I think a little more
> explanation here would be good.

I can certainly add more comments to the code, especially around the eval
cross recursion functions.

> 4. The table of functions in pgbench.sgml seems to leave something to
> be desired. We added a pretty detailed write-up on the Gaussian and
> exponential options to \setrandom, but exporand() has only this
> description:

Yep. The idea was *not* to replicate the (painful) explanations about
random functions, but that it should be shared between the function and
the \set variants.

> + <row>
> + <entry><literal><function>exporand(<replaceable>i</>,
> <replaceable>j</>, <replaceable>t</>)</></></>
> + <entry>integer</>
> + <entry>exponentially distributed random integer in the bounds,
> see below</>
> + <entry><literal>exporand(1, 10, 3.0)</></>
> + <entry>int between <literal>1</> and <literal>10</></>
> + </row>
> That's not very helpful.

The table explanation must be kept short for the table format...

> Without looking at the example, there's no way to guess what i and j
> mean, and even with looking at the example, there's no way to guess what
> t means. If, as I'm guessing, exporand() and guassrand() behave like
> \setrandom with the exponential and/or Gaussian options, then the
> documentation for one of those things should contain all of the detailed
> information and the documentation for the other should refer to it.

Indeed, that was the idea, but it seems that I forgot the pointer:-)

> More than likely, exporand() and gaussrand() should get the detailed
> explanation, and \setrandom should be document as a deprecated
> alternative to \set ... {gauss,expo,}rand(...)

Ok, the description can be moved to the function part and the \set
version reference the other.

> 5. I liked Heikki's proposed function names random_gaussian(min, max,
> threshold) and random_exponential(min, max, threshold) better than the
> ones you've picked here. I think random() would be OK instead of his
> suggestion of random_uniform(), though.


I'll submit an updated version of the patch, which addresses points 4 & 5
and documents 3, and wait for feedback on the explanations I gave before
doing anything for about 1 & 2, as I think that the implied changes are
not desirable. I'm not keen at all on changing the cross recursion
implementation (3), this would just be pretty ugly code without actual
user benefit.


In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2015-11-04 09:09:11 Re: extend pgbench expressions with functions
Previous Message Kouhei Kaigai 2015-11-04 08:28:11 Re: Foreign join pushdown vs EvalPlanQual