Re: [HACKERS] pgbench regression test failure

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Steve Singer <steve(at)ssinger(dot)info>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] pgbench regression test failure
Date: 2017-11-20 18:40:57
Message-ID: 19670.1511203257@sss.pgh.pa.us
Lists: pgsql-hackers

Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> writes:
>> 1. The per-script stats shouldn't be printed at all if there's
>> only one script. They're redundant with the overall stats.

> Indeed.
> I think the output should tend to be the same for possible automatic
> processing, whether there is one script or more, even at the price of some
> redundancy.
> Obviously this is highly debatable.

I think that ship has already sailed. It's certainly silly that the
code prints *only* per-script latency stats, and not all the per-script
stats, when there's just one script. To me the answer is to make the
latency stats conform to the rest, not make the printout even more
redundant. None of this output was designed for machine-friendliness.

(Maybe there is an argument for a "--machine-readable-output" switch
that would dump the data in some more-machine-friendly format. Though
I'm sure we'd have lots of debates about exactly what that is...)

>> 2. ISTM that we should report that 100% of the transactions were
>> above the latency limit, not 33%; that is, the appropriate base
>> for the "number of transactions above the latency limit" percentage
>> is the number of actual transactions not the number of scheduled
>> transactions.

> Hmmm. Allow me to disagree.

I dunno, it just looks odd to me that when we've set up a test case in
which every one of the transactions is guaranteed to exceed the latency
limit, the report doesn't say that they all did. I don't particularly buy
your assumption that the percentages should sum. Anybody else have an
opinion there?
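
To make the two candidate bases concrete, here is a minimal sketch, not
pgbench's actual code. The struct, field names, and function names are
hypothetical, and returning 0.0 for an empty base is an assumption made
only to keep the example free of division by zero.

```c
#include <stdint.h>

/* Hypothetical counters; these names are illustrative, not pgbench's. */
typedef struct
{
    int64_t executed;   /* transactions that actually ran */
    int64_t skipped;    /* scheduled transactions that were never started */
    int64_t late;       /* executed transactions over the latency limit */
} LatencyCounts;

/* Percentage of late transactions, using executed transactions as the base. */
double
late_pct_of_executed(const LatencyCounts *c)
{
    if (c->executed <= 0)
        return 0.0;     /* assumption: report 0 rather than divide by zero */
    return 100.0 * (double) c->late / (double) c->executed;
}

/* Same numerator, using all scheduled transactions as the base. */
double
late_pct_of_scheduled(const LatencyCounts *c)
{
    int64_t scheduled = c->executed + c->skipped;

    if (scheduled <= 0)
        return 0.0;     /* assumption: report 0 rather than divide by zero */
    return 100.0 * (double) c->late / (double) scheduled;
}
```

With assumed counts of one executed, two skipped, and one late
transaction, the first function gives 100% and the second roughly 33%,
which is the discrepancy under discussion.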

>> I also noticed that if I specify "-f sleep-100.sql" more than once,
>> the per-script TPS reports are out of line. This is evidently because
>> that calculation isn't excluding skipped xacts; but if we're going to
>> define tps as excluding skipped xacts, surely we should do so there too.

> I do not think that we should exclude skipped xacts.

Uh ... why not?
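
As a sketch of what excluding skipped xacts from a per-script TPS figure
could look like: the function and parameter names below are hypothetical
and do not come from pgbench, and returning 0.0 for degenerate inputs is
an assumption.

```c
#include <stdint.h>

/*
 * Transactions per second with skipped transactions removed from the
 * count.  "counted" is assumed to include the skipped ones.
 */
double
tps_excluding_skipped(int64_t counted, int64_t skipped, double elapsed_sec)
{
    if (elapsed_sec <= 0.0 || counted <= skipped)
        return 0.0;     /* assumption: nothing useful to report */
    return (double) (counted - skipped) / elapsed_sec;
}
```

Applying the same rule to the overall figure and to each script's figure
is what keeps the two kinds of report in line with each other.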

>> I'm also still exceedingly unhappy about the NaN business.
>> You have this comment in printSimpleStats:
>> /* print NaN if no transactions where executed */
>> but I find that unduly optimistic. It should really read more like
>> "if no transactions were executed, at best we'll get some platform-
>> dependent spelling of NaN. At worst we'll get a SIGFPE."

> Hmmm. Alas you must be right about spelling. There has been no report of
> SIGFPE issue, so I would not bother with that.

The core issue here really is that you're assuming IEEE float arithmetic.
We have not gone as far as deciding that Postgres will only run on IEEE
hardware, and I don't want to start in pgbench, especially not in
seldom-exercised corner cases.
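
One way to avoid relying on IEEE behavior is to test the count before
dividing. The following is a minimal sketch under stated assumptions:
the function name, the message wording, and the "n/a" placeholder are
all hypothetical, and this is not the code in printSimpleStats.

```c
#include <stdio.h>
#include <stdint.h>

/*
 * Format an average latency into buf.  If no transactions were
 * executed, skip the division entirely instead of producing NaN.
 * Returns whatever snprintf returns.
 */
int
format_avg_latency(char *buf, size_t buflen, double sum_ms, int64_t count)
{
    if (count <= 0)
        return snprintf(buf, buflen,
                        "latency average = n/a (no transactions)");
    return snprintf(buf, buflen, "latency average = %.3f ms",
                    sum_ms / (double) count);
}
```

Because the zero-count case never reaches the division, the output does
not depend on how a platform spells NaN or on whether it traps.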

regards, tom lane
