Re: gaussian distribution pgbench -- splits v4

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Mitsumasa KONDO <kondo(dot)mitsumasa(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: gaussian distribution pgbench -- splits v4
Date: 2014-07-23 16:39:34
Message-ID: alpine.DEB.2.10.1407231828340.3626@sto
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Hello Robert,

> Some review comments:

Thanks a lot for your return.

Please find attached two new parts of the patch (A for setrandom
extension, B for pgbench embedded test case extension).

> 1. I suggest that getExponentialrand and getGaussianrand be renamed to
> getExponentialRand and getGaussianRand.

Done.

It was named like that because "getrand" was used for the uniform case.

> 2. I suggest that the code be changed so that the branch currently
> labeled as /* uniform with extra argument */ become a hard error
> instead of a warning.
>
> 3. Similarly, I suggest that the use of gaussian or uniform be an
> error when argc < 6 OR argc > 6. I also suggest that the
> parenthesized distribution type be dropped from the error message in
> all cases.

I wish to agree, but my interpretation of the previous code is that they
were ignored before, so ISTM that we are stuck with keeping the same
unfortunate behavior.

> 4. This question mark seems like it should be a period:
>
> + * value fails the test? To be on the safe side, let us try over.

Indeed.

> 5. With regards to the following paragraph:
>
> <para>
> + The default random distribution is uniform, that is all values in the
> + range are drawn with equal probability. The gaussian and exponential
> + options allow to change this default. The mandatory
> + <replaceable>threshold</> double value controls the actual distribution
> + with gaussian or exponential.
> + </para>
>
> This paragraph needs a bit of copy-editing. Here's an attempt: "By
> default, all values in the range are drawn with equal probability.
> The <literal>gaussian</> and <literal>exponential</> options modify
> this behavior; each requires a mandatory threshold which determines
> the precise shape of the distribution." The following paragraph
> should be changed to begin with "For a Gaussian distribution" and the
> one after "For an exponential distribution".

Ok. I've kept "uniform" in the first sentence, because this is both
an option name and it is significant in term of probabilities.

> 6. Overall, I think the documentation here looks much better now, but
> I suggest adding one or two example to the Gaussian section. Like
> this: for example, if threshold is 2.0, 68% of the values will fall in
> the middle third of the interval; with a threshold of 3.0, 99.7% of
> the values will fall in the middle third of the interval. These
> numbers are fabricated, and the middle third of the interval might not
> be the best part to talk about, but you get the idea (I hope).

Done with threshold value 4.0 so I have a "middle quarter" and a "middle
half".

--
Fabien.

Attachment Content-Type Size
gauss_A_4.patch text/x-diff 16.0 KB
gauss_B_4.patch text/x-diff 11.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2014-07-23 16:58:22 Re: Making joins involving ctid work for the benefit of UPSERT
Previous Message MauMau 2014-07-23 16:17:54 Re: [RFC] Should smgrtruncate() avoid sending sinval message for temp relations