Re: Doubt in pgbench TPS number

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: coelho(at)cri(dot)ensmp(dot)fr
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Doubt in pgbench TPS number
Date: 2015-09-30 04:26:03
Message-ID: 20150930.132603.688248443160995861.t-ishii@sraoss.co.jp
Lists: pgsql-hackers

>> Here conn_total_time is the sum of time to establish connection to
>> PostgreSQL. Since establishing connections to PostgreSQL is done in
>> serial rather than in parallel, conn_total_time should have been
>> divided by nclients.
>
> After some more thinking and looking again at the connection code, I
> revise slightly my diagnostic:
>
> - the amount of parallelism is "nclients", as discussed above, when
>   reconnecting on each transaction (-C) because the connections are
>   managed in parallel from doCustom,
>
> * BUT *
>
> - if there is no reconnections (not -C) the connections are performed in
>   threadRun in a sequential way, all clients wait for the connections of
>   other clients in the same thread before starting processing
>   transactions, so "nthreads" is the right amount of parallelism in this
>   case.
>
> So on second thought the formula should rather be:
>
> ... / (is_connect? nclients: nthreads)

I don't think this is quite correct.

If is_connect is false, then the following loop is executed in threadRun():

	/* make connections to the database */
	for (i = 0; i < nstate; i++)
	{
		if ((state[i].con = doConnect()) == NULL)
			goto done;
	}

Here, nstate is nclients/nthreads. Suppose nclients = 16 and nthreads
= 2. Then 2 threads run in parallel, and each thread makes 8
connections (nstate = 8) in *serial*. The total connection time for a
thread is calculated as "the time the loop ends" - "the time the loop
starts". So if it takes 1 second to establish a connection, the
total connection time for a thread will be 8 seconds. Thus the grand
total of connection time will be 2 * 8 = 16 seconds.

If is_connect is true, the following loop is executed:

	/* send start up queries in async manner */
	for (i = 0; i < nstate; i++)
	{
		CState	   *st = &state[i];
		Command   **commands = sql_files[st->use_file];
		int			prev_ecnt = st->ecnt;

		st->use_file = getrand(thread, 0, num_files - 1);
		if (!doCustom(thread, st, &thread->conn_time, logfile, &aggs))

In this loop, exactly the same thing happens as in the is_connect =
false case. If the time to establish a connection is 1 second, the
total connection time will be the same as in the is_connect = false
case, i.e. 16 seconds.

In summary, I see no reason to change the v1 patch.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
