Re: [WIP] Zipfian distribution in pgbench

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Alik Khilazhev <a(dot)khilazhev(at)postgrespro(dot)ru>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP] Zipfian distribution in pgbench
Date: 2017-07-13 16:14:23
Message-ID: alpine.DEB.2.20.1707131810450.20175@lancre
Lists: pgsql-hackers


Hello Alik,

A few comments about the patch v2.

Patch applies and compiles.

Documentation says that the closer theta is to 0 the flatter the distribution,
but the implementation requires at least 1, and produces strange error messages:

zipfian parameter must be greater than 1.000000 (not 1.000000)

Could theta be allowed between 0 and 1? I've tried forcing theta = 0.1
and it worked well, so I'm not sure that I understand the restriction.
I also tried theta = 0.001, but the results seemed less good.

I have also tried to check the distribution against the explanations, using
the attached scripts with n=100 and theta=1.000001, 1.5 and 3.0. It does not
seem to work: there is a repeatable +15% bias on i=3 and a repeatable -3% to
-30% bias for values of i in 10-100, and this for all three values of theta.
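
For reference, the expected frequencies can be computed along these lines.
This is only a sketch: it assumes the documented distribution is the usual
bounded Zipfian, P(i) proportional to 1/i^theta for i in 1..n, and the names
zipf_expected and relative_bias are made up for illustration (they are not
part of the patch nor of the attached scripts).

```python
def zipf_expected(n, theta):
    """Return [P(1), ..., P(n)], assuming P(i) is proportional to 1 / i**theta."""
    weights = [1.0 / i ** theta for i in range(1, n + 1)]
    total = sum(weights)  # generalized harmonic number H(n, theta)
    return [w / total for w in weights]


def relative_bias(counts, n, theta):
    """Relative deviation of observed counts from the expected counts.

    counts[0] is the number of draws of i=1, counts[1] of i=2, and so on.
    A result of 0.15 for some i means that i was drawn 15% too often.
    """
    draws = sum(counts)
    expected = zipf_expected(n, theta)
    return [c / (draws * p) - 1.0 for c, p in zip(counts, expected)]
```

With counts gathered for n=100, an entry of about 0.15 at i=3 would correspond
to the +15% bias mentioned above. Note that nothing in this computation needs
theta > 1 as long as n is finite.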

If you try the scripts, take care to set the parameters (theta, n) consistently.

About the code:

Remove spaces and tabs at end of lines or on empty lines.

zipfn: I would suggest moving the pg_erand48 call out and passing its result
instead of passing the thread. The erand call would move to getZipfianRand.
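
To illustrate the shape I have in mind, here is a small sketch in Python
rather than C. It is not the patch's algorithm: it uses a plain inverse-CDF
lookup as a stand-in, and zipf_cdf, zipf_from_uniform and get_zipfian_rand are
hypothetical names. The point is only that the core function takes the uniform
draw as an argument, so it is deterministic and independent of the thread
state, while the caller owns the random generator.

```python
import bisect
import itertools
import random


def zipf_cdf(n, theta):
    """Cumulative probabilities for i = 1..n, assuming P(i) ~ 1 / i**theta."""
    weights = [1.0 / i ** theta for i in range(1, n + 1)]
    total = sum(weights)
    return list(itertools.accumulate(w / total for w in weights))


def zipf_from_uniform(u, cdf):
    """Map a uniform draw u in [0, 1) to a value in 1..n (inverse-CDF lookup).

    No random state is touched here: the same u always gives the same value.
    """
    index = bisect.bisect_right(cdf, u)
    # Guard against the last CDF entry being slightly below 1.0 due to rounding.
    return min(index, len(cdf) - 1) + 1


def get_zipfian_rand(rng, cdf):
    """Caller side: draw the uniform number here, then pass only the result."""
    return zipf_from_uniform(rng.random(), cdf)
```

For instance, get_zipfian_rand(random.Random(42), zipf_cdf(100, 1.5)) draws one
value in 1..100, and zipf_from_uniform can be checked with fixed inputs.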

I'm wondering whether the "nearest" hack could be removed so as to simplify
the cache functions code...

Avoid editing lines without real changes (spaces/tabs?):
- thread->logfile = NULL; /* filled in later */
+ thread->logfile = NULL; /* filled in later */

The documentation explaining the intuitive distribution talks about an N
but gives no hint of how its value depends on the parameters.

There is a useless empty line before "</para>" after that.

--
Fabien.

Attachment Content-Type Size
compte_init.sql application/x-sql 187 bytes
compte_bench.sql application/x-sql 180 bytes
compte_expected.sql application/x-sql 550 bytes
compte_results.sql application/x-sql 73 bytes
