| From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> | 
|---|---|
| To: | coelho(at)cri(dot)ensmp(dot)fr | 
| Cc: | pgsql-hackers(at)postgresql(dot)org | 
| Subject: | Re: Doubt in pgbench TPS number | 
| Date: | 2015-09-28 08:38:19 | 
| Message-ID: | 20150928.173819.151414274788744526.t-ishii@sraoss.co.jp | 
| Lists: | pgsql-hackers | 
>> I'm not sure about this. I think pgbench will not start transactions
>> until all clients establish connections to PostgreSQL.
> 
> I think that is true if "!is_connect", all client connections are
> performed at the beginning of threadRun, but under -C each client has
> its own connect/deconnect integrated within doCustom, so it is done in
> parallel to other clients having their transactions processed, hence
> the issue with the formula.
Really?
I tested with pgpool-II configured to accept up to 2 connections,
then ran pgbench with -C and -c 32. pgbench blocked as expected, so
I attached gdb and got this stack trace:
(gdb) bt
#0  0x00007f48d5f17110 in __poll_nocancel ()
    at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f48d6724056 in pqSocketCheck ()
   from /usr/local/src/pgsql/current/lib/libpq.so.5
#2  0x00007f48d6724940 in pqWaitTimed ()
   from /usr/local/src/pgsql/current/lib/libpq.so.5
#3  0x00007f48d671f3e2 in connectDBComplete ()
   from /usr/local/src/pgsql/current/lib/libpq.so.5
#4  0x00007f48d671fbcf in PQconnectdbParams ()
   from /usr/local/src/pgsql/current/lib/libpq.so.5
#5  0x0000000000402b2b in doConnect () at pgbench.c:650
#6  0x0000000000404591 in doCustom (thread=0x25c2f40, st=0x25c2770, 
    conn_time=0x25c2f90, logfile=0x0, agg=0x7fffba224260) at pgbench.c:1353
#7  0x000000000040a1d5 in threadRun (arg=0x25c2f40) at pgbench.c:3581
#8  0x0000000000409ab4 in main (argc=12, argv=0x7fffba224668) at pgbench.c:3455
As you can see, one of the threads is trying to connect to PostgreSQL
(actually pgpool-II) and is waiting for a reply.
In threadRun() around line 3581:
	/* send start up queries in async manner */
	for (i = 0; i < nstate; i++)
	{
		CState	   *st = &state[i];
		Command   **commands = sql_files[st->use_file];
		int			prev_ecnt = st->ecnt;
		st->use_file = getrand(thread, 0, num_files - 1);
		if (!doCustom(thread, st, &thread->conn_time, logfile, &aggs))
			remains--;			/* I've aborted */
		if (st->ecnt > prev_ecnt && commands[st->state]->type == META_COMMAND)
		{
			fprintf(stderr, "client %d aborted in state %d; execution of meta-command failed\n",
					i, st->state);
			remains--;			/* I've aborted */
			PQfinish(st->con);
			st->con = NULL;
		}
	}
Here doCustom() is called with st->con == NULL. In doCustom() around
line 1353:
	if (st->con == NULL)
	{
		instr_time	start,
					end;
		INSTR_TIME_SET_CURRENT(start);
		if ((st->con = doConnect()) == NULL)
		{
doConnect() blocks until PostgreSQL (pgpool-II) accepts the
connection.
Because the outer loop in threadRun() keeps iterating until all clients
have successfully connected to PostgreSQL, pgbench is blocked here.
	/* send start up queries in async manner */
	for (i = 0; i < nstate; i++)
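To make the blocking behavior concrete, here is a toy model of that loop. It is not pgbench code: first_blocked_client() and its parameters are hypothetical names, and the model assumes a single thread, a server that accepts at most max_conn connections, and no client able to disconnect while the loop is stuck in a blocking connect.

```c
#include <assert.h>

/*
 * Toy model (not pgbench code): one thread walks its clients in order and
 * performs a blocking connect for each. The hypothetical server accepts at
 * most max_conn connections.
 *
 * Returns the index of the client whose connect would never return,
 * or -1 if every client obtains a connection.
 */
int
first_blocked_client(int nclients, int max_conn)
{
	int			open_conns = 0;
	int			i;

	assert(nclients >= 0 && max_conn >= 0);

	for (i = 0; i < nclients; i++)
	{
		if (open_conns >= max_conn)
			return i;			/* the blocking connect stalls the loop here */
		open_conns++;			/* connected; go on to the next client */
	}
	return -1;
}
```

With the setup from my test (32 clients, 2 allowed connections) this model returns 2: the third client never gets past its connect, so the loop never reaches the remaining clients.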
>> I'm going to commit your patch if there's no objection.
> 
> This seems fine with me.
> 
> The formula change, and just this one, should probably be backported
> somehow, as this is a bug, wherever pgbench resides in older
> versions. It is just 's/nthreads/nclients/' in the printResult formula
> for computing tps_exclude.
Yeah, that's definitely a bug, but I'm afraid the fix will change the
TPS number and may break backward compatibility. Since we have
lived with the bug for years, I hesitate to backport it to older stable
branches...
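To show how much the reported number can move, here is a minimal sketch of the tps_exclude computation with the divisor passed in. The function and parameter names are illustrative, not the ones in printResults(), and it assumes conn_total_time is the connection time summed over all connections made during the run.

```c
#include <assert.h>

/*
 * Sketch only: TPS with connection establishment time excluded.
 * "divisor" is nthreads in the old formula and nclients in the fixed one.
 * conn_total_time is assumed to be summed over all connections.
 */
double
tps_excluding_connections(double xacts, double total_time,
						  double conn_total_time, int divisor)
{
	double		denom;

	assert(divisor > 0);

	/* subtract the average connection time per thread (or per client) */
	denom = total_time - conn_total_time / divisor;
	assert(denom > 0.0);

	return xacts / denom;
}
```

With made-up numbers (1000 transactions, a 10 second run, 8 seconds of summed connection time, 1 thread, 4 clients), dividing by nthreads gives 1000 / (10 - 8/1) = 500 tps, while dividing by nclients gives 1000 / (10 - 8/4) = 125 tps. That gap is the kind of change users of older branches would see after a backport.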
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp