From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>

To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>

Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Hironobu SUZUKI <hironobu(at)interdb(dot)jp>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>

Subject: Re: pgbench - add pseudo-random permutation function

Date: 2021-04-05 12:07:25

Message-ID: alpine.DEB.2.22.394.2104051009040.2033293@pseudo


Lists: pgsql-hackers

Hello Dean,

>> I think that permute should only use integer operations. I'd suggest to

>> use one of the integer variants instead of going through a double

>> computation and casting back to int. The internal state is based on

>> integers, I do not see the added value of going through floats, possibly

>> enduring floating point issues (underflow, rounding, normalization,

>> whatever) on the way, whereas from start to finish we just need ints.

>

> This is the already-established coding pattern used in getrand() to

> pick a random number uniformly in some range that's not necessarily a

> power of 2.

Indeed. I'm not particularly happy with that one either.

> Floating point underflow and normalisation issues are not possible

> because erand48() takes a 48-bit integer N and uses ldexp() to store

> N/2^48 in a double, which is an exact operation since IEEE doubles

> have something like 56-bit mantissas.

The double mantissa is 52 bits (53 counting the implicit leading bit).

> This is then turned back into an integer in the required range by

> multiplying by the desired maximum value, so there's never any risk of

> underflow or normalisation issues.

ISTM that there are significant issues when multiplying by an integer, because the integer is cast to a double before the multiplication, so if the int is over 52 bits it is silently truncated: some values are simply lost in the process and will never be drawn. Probably not too many of them, but some of them anyway.

> I guess that there may be rounding variations once the required

> maximum value exceeds something like 2^56 (although the comment in

> getrand() is much more conservative than that), so it's possible that

> a pgbench script calling random() with (ub-lb) larger than that might

> give different results on different platforms.

Dunno. This may be the same issue I'm pointing out above.

> For the non-uniform random functions, that effect might well kick in

> sooner. I'm not aware of any field complaints about that though,

> possibly because real-world data sizes are significantly smaller than

> that.

>

> In practice, permute() is likely to take its input from one of the

> non-uniform random functions, so it won't be permute() that first

> introduces rounding issues.

Sure. I'd like permute to be immune to that.

>> See attached v27 proposal.

>

> This update has a number of flaws. For example, this:

Indeed:-)

> +static uint64

> +randu64(RandomState * state)

> +{

> + uint64 r1 = pg_jrand48((*state).xseed),

> + r2 = pg_jrand48((*state).xseed);

> + return r1 << 51 | r2 << 13 | r1 >> 13;

> +}

>

> It still uses a 48-bit RandomState, so it doesn't improve on getrand()

> in that respect.

Sure. I'm pretty unhappy with that one, but I was not trying to address that. My idea was that randu64 would be replaced with something better at some point. My intention was "64-bit pseudo-random"; my implementation does not work, ok:-)

> It replaces a single erand48() call with 2 jrand48() calls, which

> comes at a cost in terms of performance. (In fact, changing the number

> of rounds in the previous version of permute() from 4 to 6 has a

> smaller performance impact than this -- more about that below.)

Sure, same remark as above: I was not trying to address that point.

> jrand48() returns a signed 32-bit integer, which has a 50% chance of

> being negative, so when that is cast to a uint64, there is a 50%

> chance that the 32 most significant bits will be 1.

Argh.

> When the various parts are OR'ed together, that will then mask out any

> randomness from the other parts. For example, 50% of the time, the

> jrand48() value used for r1 will be negative, and so 32 bits in the

> middle of the final result will all be set.

Argh. I hesitated to use xor. I should not have:-)

> So overall, the results will be highly non-uniform, with less

> randomness and poorer performance than erand48().

Indeed, bad choice. I wanted to use the unsigned version, but it is not implemented, and I switched to the signed version without thinking through some of the implications.

> In addition, it returns a result in the range [0,2^64), which is not

> really what's wanted. For example:

>

> + /* Random offset */

> + r = randu64(&random_state2);

> + v = (v + r) % size;

>

> The previous code used r = getrand(0, size-1), which gave a uniformly

> random offset. However, the new code risks introducing additional

> non-uniformity when size is not a power of 2.

ISTM that the overall non-uniformity is worse with the float approach than with the int approach.

Conceptually, the same kind of bias is expected whether you go through floats or through ints, because the underlying power-of-two maths is the same; what makes the difference in reducing non-uniformity is using more bits. Basically, when enough source bits are used, each output value appears either n or n+1 times, and the relative imbalance becomes negligible.

When not enough bits are provided, things get ugly: for instance, with size = 2^53, even if the floats used the full 52-bit pseudo-random mantissa (it is really only 48 bits with erand48), only even numbers would be produced, whereas with ints all numbers are produced. With erand48, when size is above 48 bits, ISTM that the last bits are always zero with the double approach. And I'm not even counting the values lost when size itself is truncated in the conversion to double.

> Finally, worst of all, this random offset is no longer bijective, due

> to 64-bit integer wrap-around. For example, suppose that size=100 and

> r=(2^64-10), then the following 2 values both map to the same result:

>

> v = 20 -> (v + r) % size

> = (20 + (2^64 - 10)) % 100

> = (2^64 + 10) % 100

> = (10) % 100

> = 10

>

> v = 4 -> (v + r) % size

> = (4 + (2^64 - 10)) % 100

> = (2^64 - 6) % 100

> = (18446744073709551610) % 100

> = 10

>

> So not only are the results no longer uniformly random, they might not

> even form a permutation.

Indeed, this one is pretty fun! Probably the right formula for this

approach is "(v + r % size) % size", which is kind of a mouthful.

I fully agree that my v27 implementation is botched on many dimensions, some of them intentional and temporary (using jrand48 twice) and some of them accidental (not considering int overflows, being optimistic about signed-to-unsigned casts…).

I still disagree, though, that going through floating point is the right thing to do, because of some of the issues I outlined above (e.g. truncation and rounding for sizes above 48/52 bits). Basically, I think that an algorithm dealing with integers should not have to resort to floating-point computations unless it is actually required. This is not the case for permute, where v26 uses doubles as glorified 48-bit integers, which could be extended to 52-bit integers, but no more. The only benefit I see is implicitly using the internal 104-bit product, rounded by truncation on multiply, but I do not think that implicitly reducing the practical int values to 52 bits is worth it, and the same quality (bias) can be achieved for 63-bit integers by keeping them as ints and writing the right formula, which I fully failed to demonstrate in v27.

> I did some more testing of the previous version (v26), this time

> looking at much larger sizes, all the way up to the maximum, which is

> 2^63-1 since it comes from a signed int64. In general, the results

> were very good, though I did notice some slight non-uniformity in the

> way adjacent inputs were separated from another when the size was just

> under a power of 2. I think that's the hardest case for this

> algorithm, because there's very little overlap between the 2 halves.

Yes, fewer values are steered twice per round. However, as for adjacent values at large sizes, I'm wondering whether this may have more to do with the 48-bit limitations, so that the lower bits are not really xored, for instance. Not sure.

> Increasing the number of rounds from 4 to 6 ironed out that

> non-uniformity (and as mentioned above, is still cheaper than using

> randu64() with 4 rounds), so I think we should go with that.

There is a quality-cost tradeoff. With the previous version I convinced myself that 4 rounds were a good compromise (not perfect, but ok for keeping the cost low on practical sizes). With this version, I'll admit that I do not have an opinion.

> You may wish to submit a separate patch to replace pgbench's use of

> *rand48() with something else, and that would be discussed on its own

> merits, but I don't see why that should hold up adding permute().

I'll see.

Attached is a v28 which I hope fixes the many issues above and stays with ints. The randu64 is still something of a joke: I artificially reduced the cost by calling jrand48 once and extending it to 64 bits, so it can give an idea of the cost incurred if a 64-bit PRNG were used.

Now you are the committer, so you can do as you please; I'm just stating my (mathematical) opinions about using floating-point computations for this. I think that, apart from this point of principle/philosophy, the permute performance and implementation are reasonable, and better than my initial version because it avoids the int128 computations and the large-prime-number business.

--

Fabien.

Attachment: pgbench-prp-func-28.patch (text/x-diff, 16.2 KB)
