From: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
---|---|

To: | Andres Freund <andres(at)anarazel(dot)de> |

Cc: | Hironobu SUZUKI <hironobu(at)interdb(dot)jp>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |

Subject: | Re: pgbench - add pseudo-random permutation function |

Date: | 2019-02-14 21:17:53 |

Message-ID: | alpine.DEB.2.21.1902142155050.20189@lancre |

Lists: | pgsql-hackers |

Hello Andres,

>> +# PGAC_C_BUILTIN_CLZLL

>

> I think this has been partially superseded by

>

> commit 711bab1e4d19b5c9967328315a542d93386b1ac5

> Author: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>

> Date: 2019-02-13 16:10:06 -0300

Indeed, the patch needs a rebase and conflict resolution. I'll do it, later.

>> <para>

>> + Function <literal>pr_perm</literal> implements a pseudo-random permutation.

>> + It allows to mix the output of non uniform random functions so that

>> + values drawn more often are not trivially correlated.

>> + It permutes integers in [0, size) using a seed by applying rounds of

>> + simple invertible functions, similarly to an encryption function,

>> + although beware that it is not at all cryptographically secure.

>> + Compared to <literal>hash</literal> functions discussed above, the function

>> + ensures that a perfect permutation is applied: there are no collisions

>> + nor holes in the output values.

>> + Values outside the interval are interpreted modulo the size.

>> + The function errors if size is not positive.

>> + If no seed is provided, <literal>:default_seed</literal> is used.

>> + For a given size and seed, the function is fully deterministic: if two

>> + permutations on the same size must not be correlated, use distinct seeds

>> + as outlined in the previous example about hash functions.

>> + </para>

>

> This doesn't really explain why we want this in pgbench.

Who is "we"?

If someone runs non-uniform tests, i.e. with random_exp/zipf/gauss, close
values are drawn with a similar frequency and are thus correlated, inducing
an undeserved correlation at the page level (e.g. for reads) and better
performance than would be the case if relative frequencies were not
correlated with key values.

So the function allows more realistic non-uniform tests, whereas
currently we can only have non-uniform tests with very unrealistically
correlated values at the key level and possibly at the page level, meaning
unrepresentative performance figures because of this induced bias.
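To make the page-level argument concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: `toy_perm` is a hypothetical single affine round, not the algorithm in the patch, and the table size, the page capacity and the choice of keys 0..99 as the "hot" keys of a skewed generator are made up for the example.

```python
from math import gcd

def toy_perm(i, size, seed):
    """Toy bijection on [0, size): one affine round (NOT the patch's algorithm)."""
    if size <= 0:
        raise ValueError("size must be positive")
    # Derive a multiplier from the seed, then bump it until it is coprime
    # with size, which is what makes x -> a*x + b (mod size) invertible.
    a = (2 * seed + 1) % size or 1
    while gcd(a, size) != 1:
        a += 1
    b = (seed * 7919) % size
    # Inputs outside [0, size) are taken modulo size.
    return (a * (i % size) + b) % size

SIZE = 10_000          # number of keys (assumed)
KEYS_PER_PAGE = 100    # hypothetical page capacity
HOT = 100              # a skewed generator draws keys 0..99 most often

def pages_touched(keys):
    """Number of distinct pages holding the given keys."""
    return len({k // KEYS_PER_PAGE for k in keys})

hot_keys = range(HOT)
plain_pages = pages_touched(hot_keys)
mixed_pages = pages_touched(toy_perm(k, SIZE, seed=12345) for k in hot_keys)

print("pages holding the hot keys, unpermuted:", plain_pages)
print("pages holding the hot keys, permuted:  ", mixed_pages)
```

Without the permutation, all the hot keys sit on a single page, so a skewed read workload mostly hits one cached page. After the permutation the same hot keys are scattered over many pages, which is closer to what a real skewed workload would look like.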

This is under the assumption that pgbench should allow more realistic
performance test scenarios, which I believe is a desirable purpose. If
someone disagrees with this purpose, then they would consider both the
non-uniform random functions and this proposed pseudo-random permutation
function useless, as probably most other additions to pgbench.

--

Fabien.
