Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org, Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Date: 2018-06-09 06:55:30
Message-ID: alpine.DEB.2.21.1806090810090.5307@lancre
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Hello Marina,

> v9-0001-Pgbench-errors-use-the-RandomState-structure-for-.patch
> - a patch for the RandomState structure (this is used to reset a client's
> random seed during the repeating of transactions after serialization/deadlock
> failures).

A few comments about this first patch.

Patch applies cleanly, compiles, global & pgbench "make check" ok.

I'm mostly ok with the changes, which cleanly separate the different use
of random between threads (script choice, throttle delay, sampling...) and
client (random*() calls).

This change is necessary so that a client can restart a transaction
deterministically (at the client level at least), which is the ultimate
aim of the patch series.

A few remarks:

The RandomState struct is 6 bytes, which will induce some padding when
used. This is life and pre-existing. No problem.

ISTM that the struct itself does not need a name, ie. "typedef struct {
... } RandomState" is enough.

There could be clear comments, say in the TState and CState structs, about
what randomness is impacted (i.e. script choices, etc.).

getZipfianRand, computeHarmonicZipfian: The "thread" parameter was
justified because it was used for two fieds. As the random state is
separated, I'd suggest that the other argument should be a zipfcache
pointer.

While reading your patch, it occurs to me that a run is not deterministic
at the thread level under throttling and sampling, because the random
state is sollicited differently depending on when transaction ends. This
suggest that maybe each thread random_state use should have its own random
state.

In passing, and totally unrelated to this patch:

I've always been a little puzzled about why a quite small 48-bit internal
state random generator is used. I understand the need for pg to have a
portable & state-controlled thread-safe random generator, but why this
particular small one fails me. The source code (src/port/erand48.c,
copyright in 1993...) looks optimized for 16 bits architectures, which is
probably pretty inefficent to run on 64 bits architectures. Maybe this
could be updated with something more consistent with today's processors,
providing more quality at a lower cost.

--
Fabien.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Gierth 2018-06-09 07:27:43 Re: Compromised postgresql instances
Previous Message Thomas Kellerer 2018-06-09 05:27:35 Re: Compromised postgresql instances