Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

From: Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org, Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Date: 2018-06-09 21:27:02
Message-ID: 66cba8455d0c1d5b1bbd4f15f379bbd8@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 09-06-2018 9:55, Fabien COELHO wrote:
> Hello Marina,

Hello!

>> v9-0001-Pgbench-errors-use-the-RandomState-structure-for-.patch
>> - a patch for the RandomState structure (this is used to reset a
>> client's random seed during the repeating of transactions after
>> serialization/deadlock failures).
>
> A few comments about this first patch.

Thank you very much!

> Patch applies cleanly, compiles, global & pgbench "make check" ok.
>
> I'm mostly ok with the changes, which cleanly separate the different
> use of random between threads (script choice, throttle delay,
> sampling...) and client (random*() calls).

Glad to hear it :)

> This change is necessary so that a client can restart a transaction
> deterministically (at the client level at least), which is the
> ultimate aim of the patch series.
>
> A few remarks:
>
> The RandomState struct is 6 bytes, which will induce some padding when
> used. This is life and pre-existing. No problem.
>
> ISTM that the struct itself does not need a name, ie. "typedef struct
> { ... } RandomState" is enough.

Ok!

> There could be clear comments, say in the TState and CState structs,
> about what randomness is impacted (i.e. script choices, etc.).

Thank you, I'll add them.

> getZipfianRand, computeHarmonicZipfian: The "thread" parameter was
> justified because it was used for two fieds. As the random state is
> separated, I'd suggest that the other argument should be a zipfcache
> pointer.

I agree with you and I will change it.

> While reading your patch, it occurs to me that a run is not
> deterministic at the thread level under throttling and sampling,
> because the random state is sollicited differently depending on when
> transaction ends. This suggest that maybe each thread random_state use
> should have its own random state.

Thank you, I'll fix this.

> In passing, and totally unrelated to this patch:
>
> I've always been a little puzzled about why a quite small 48-bit
> internal state random generator is used. I understand the need for pg
> to have a portable & state-controlled thread-safe random generator,
> but why this particular small one fails me. The source code
> (src/port/erand48.c, copyright in 1993...) looks optimized for 16 bits
> architectures, which is probably pretty inefficent to run on 64 bits
> architectures. Maybe this could be updated with something more
> consistent with today's processors, providing more quality at a lower
> cost.

This sounds interesting, thanks!
*went to look for a multiplier and a summand that are large enough and
are mutually prime..*

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Marina Polyakova 2018-06-09 21:38:55 Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Previous Message Tom Lane 2018-06-09 21:00:26 Re: why partition pruning doesn't work?