Re: pgbench - add pseudo-random permutation function

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Hironobu SUZUKI <hironobu(at)interdb(dot)jp>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, David Steele <david(at)pgmasters(dot)net>
Subject: Re: pgbench - add pseudo-random permutation function
Date: 2020-02-01 10:12:25
Message-ID: alpine.DEB.2.21.2002011007340.20752@pseudo
Lists: pgsql-hackers


Hello Alvaro,

>> I read the whole thread, I still don't know what this patch is supposed to
>> do. I know what the words in the subject line mean, but I don't know how
>> this helps a pgbench user run better benchmarks. I feel this is also the
>> sentiment expressed by others earlier in the thread. You indicated that
>> this functionality makes sense to those who want this functionality, but so
>> far only two people, namely the patch author and the reviewer, have
>> participated in the discussion on the substance of this patch. So either
>> the feature is extremely niche, or nobody understands it. I think you ought
>> to take about three steps back and explain this in more basic terms, even
>> just in email at first so that we can then discuss what to put into the
>> documentation.
>
> After re-reading one more time, it dawned on me that the point of this
> is similar in spirit to this one:
> https://wiki.postgresql.org/wiki/Pseudo_encrypt

Indeed. The one on the wiki is not usable for benchmarking because it
permutes the whole integer domain, whereas in a benchmark you want a
permutation over a given size and you want it to be seedable. Otherwise,
it addresses the same correlation-avoidance problem.
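
To make "a given size" and "seeding" concrete, here is a minimal sketch in
Python of a seeded pseudo-random permutation over [0, size). It is not the
algorithm used in the patch: the name permute_sketch, the SHA-256 round
function and the number of rounds are assumptions chosen only for
illustration. It uses a small Feistel network on the next even number of
bits and "cycle walks" until the result falls back inside [0, size).

```python
import hashlib


def _round(value: int, seed: int, rnd: int, half_bits: int) -> int:
    """Hypothetical round function: hash (seed, round, value) down to half_bits bits."""
    digest = hashlib.sha256(f"{seed}:{rnd}:{value}".encode()).digest()
    return int.from_bytes(digest[:8], "big") & ((1 << half_bits) - 1)


def permute_sketch(i: int, size: int, seed: int = 0, rounds: int = 4) -> int:
    """Map i in [0, size) to a distinct value in [0, size), depending on seed."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= i < size:
        raise ValueError("i must be in [0, size)")

    # Smallest balanced Feistel domain (2 * half_bits bits) that covers size.
    half_bits = max(1, ((size - 1).bit_length() + 1) // 2)
    mask = (1 << half_bits) - 1

    x = i
    while True:
        left, right = x >> half_bits, x & mask
        for rnd in range(rounds):
            # Each Feistel round is invertible, so the whole network is a bijection.
            left, right = right, left ^ _round(right, seed, rnd, half_bits)
        x = (left << half_bits) | right
        # Cycle walking: re-apply the bijection until we land inside [0, size).
        if x < size:
            return x
```

Calling permute_sketch(i, 1000, seed=42) for every i in 0..999 yields each
value in 0..999 exactly once, and a different seed gives a different
arrangement. No table of size entries is needed, which matters when the
benchmark key space is large.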

> The idea seems to be to map the int4 domain into itself, so you can use
> a sequence to generate numbers that will not look like a sequence,
> allowing the user to hide some properties (such as the generation rate)
> that might be useful to an eavesdropper/attacker. In terms of writing
> benchmarks, it seems useful to destroy all locality of access, which
> changes the benchmark completely.

Yes.

> (I'm not sure if this is something benchmark writers really want to
> have.)

I do not understand this sentence. I am sure that benchmark writers should
want to avoid unrealistic correlations that have a performance impact.
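
As an illustration of the kind of correlation meant here, the following
Python sketch compares how many table pages the hottest keys touch with and
without a permutation. Everything in it is an assumption made for the
example: the table size, the 100 rows per page, the idea that the 100
hottest keys of a skewed distribution are ids 0..99, and the use of an
explicit shuffled table (random.Random.shuffle) instead of a computed
permutation function.

```python
import random

SIZE = 10_000          # hypothetical number of rows
ROWS_PER_PAGE = 100    # hypothetical rows stored per page
HOT_KEYS = range(100)  # with a skewed draw, the most frequent ids are the smallest ones


def build_permutation(size: int, seed: int) -> list[int]:
    """Return a seeded shuffle of 0..size-1, used as a lookup table."""
    table = list(range(size))
    random.Random(seed).shuffle(table)
    return table


def pages_touched(keys, rows_per_page: int) -> int:
    """Count distinct pages hit, assuming rows are stored in id order."""
    return len({key // rows_per_page for key in keys})


perm = build_permutation(SIZE, seed=5432)

# Without permutation, all hot keys sit next to each other on one page.
clustered = pages_touched(HOT_KEYS, ROWS_PER_PAGE)

# With permutation, the same hot keys are spread over many pages.
scattered = pages_touched((perm[key] for key in HOT_KEYS), ROWS_PER_PAGE)

print(f"pages touched without permutation: {clustered}")
print(f"pages touched with permutation:    {scattered}")
```

Without the permutation the hot set fits on a single page, so caching makes
the skewed benchmark look better than a real workload where popular rows
are not physically adjacent.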

> If I'm right, then I agree that the documentation provided with the
> patch does a pretty bad job at explaining it, because until now I didn't
> at all realize this is what it was.

The documentation can be improved, no doubt.

Attached is an attempt at improving things. I have added an explicit note
and hijacked an existing example to better illustrate the purpose of the
function.

--
Fabien.

Attachment: pgbench-prp-func-18.patch (text/x-diff, 20.7 KB)
