Re: [HACKERS] Doubt in pgbench TPS number

From: Andres Freund <andres(at)anarazel(dot)de>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Tatsuo Ishii <ishii(at)postgresql(dot)org>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Doubt in pgbench TPS number
Date: 2020-02-27 20:26:36
Message-ID: 20200227202636.qaf7o6qcajsudoor@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2015-09-25 20:35:45 +0200, Fabien COELHO wrote:
>
> Hello Tatsuo,
>
> > Hmmm... I never use -C. The formula seems ok:
> >
> > tps_exclude = normal_xacts / (time_include -
> > (INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads));
>
> Hmmm... it is not:-)
>
> I think that the degree of parallelism to consider is nclients, not
> nthreads: while connection time is accumulated in conn_time, other clients
> are possibly doing their transactions, in parallel, even if it is in the
> same thread, so it is not "stopped time" for all clients. It starts to
> matter with "-j 1 -c 30" and slow transactions, the cumulated conn_time in
> each thread may be arbitrary close to the whole time if there are many
> clients.

I think this pretty much entirely broke the tps_exclude logic when not
using -C, especially when -c and -j differ. The wait time there is
actually per thread, not per client.

In this example I set post_auth_delay=1s on the server. Pgbench
tells me:
pgbench -M prepared -c 180 -j 180 -T 10 -P1 -S
tps = 897607.544862 (including connections establishing)
tps = 1004793.708611 (excluding connections establishing)

pgbench -M prepared -c 180 -j 60 -T 10 -P1 -S
tps = 739502.979613 (including connections establishing)
tps = 822639.038779 (excluding connections establishing)

pgbench -M prepared -c 180 -j 30 -T 10 -P1 -S
tps = 376468.177081 (including connections establishing)
tps = 418554.527585 (excluding connections establishing)

which pretty obviously is bogus. While I'd not expect it to work
perfectly, the "excluding" number should stay roughly constant.

The fundamental issue is that without -C *none* of the connections in
each thread gets to actually execute work before all of them have
established a connection. So dividing conn_total_time by nclients is
wrong.
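
To make the arithmetic concrete, here is a rough sketch of a toy model.
It is not pgbench code. It assumes each thread opens its clients'
connections strictly one after another, that every connection costs the
same fixed delay, and that transactions then run at a made-up constant
rate (steady_tps). The struct and function names are hypothetical.

```c
#include <assert.h>

/* Toy model only, NOT pgbench code. Assumptions: each thread connects its
 * clients sequentially, no client of a thread works before that thread has
 * connected all of them, and throughput afterwards is a constant steady_tps. */
typedef struct
{
    double by_nthreads;   /* conn_total_time divided by nthreads */
    double by_nclients;   /* conn_total_time divided by nclients */
} TpsExclude;

TpsExclude
model_tps_exclude(int nclients, int nthreads, double conn_delay,
                  double time_include, double steady_tps)
{
    TpsExclude r;
    double per_thread_conn;   /* seconds each thread spends connecting */
    double conn_total_time;   /* summed over all threads */
    double normal_xacts;      /* transactions completed in the run */

    assert(nthreads > 0 && nclients >= nthreads);

    per_thread_conn = conn_delay * nclients / nthreads;
    conn_total_time = per_thread_conn * nthreads;
    normal_xacts = steady_tps * (time_include - per_thread_conn);

    r.by_nthreads = normal_xacts /
        (time_include - conn_total_time / nthreads);
    r.by_nclients = normal_xacts /
        (time_include - conn_total_time / nclients);
    return r;
}
```

With 180 clients, a 1s delay, a 10s run and a hypothetical steady rate of
1,000,000 tps, this model keeps the nthreads variant at 1,000,000 for
-j 180, -j 60 and -j 30, while the nclients variant drops to roughly
778,000 and 444,000 for -j 60 and -j 30. That is the same shape as the
numbers above, though the model ignores everything else that goes on in
a real run.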

For more realistic connection delays this leads to the 'excluding'
number being way too close to the 'including' number, even if a
substantial portion of the time is spent establishing connections.

Greetings,

Andres Freund
