Re: pgbench logging broken by time logic changes

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, david(dot)christensen(at)crunchydata(dot)com
Subject: Re: pgbench logging broken by time logic changes
Date: 2021-07-08 12:17:28
Message-ID: CA+hUKG+2NAmDZ9g4GMjy939AsVifj117O40M1h7zUj02bzwjcA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 1, 2021 at 8:50 PM Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> wrote:
> Attached a v14 in that spirit.

Thanks! This doesn't seem to address the complaint, though. Don't
you need to do something like this? (See also attached.)

+ initStats(&aggs, start - (start + epoch_shift) % 1000000);

That should reproduce what pgbench 13 does implicitly when it uses
time(NULL). Namely, it rewinds to the start of the current *wall
clock* second, so that all future aggregates also start at round
number wall clock times, at the cost of making the first aggregate
miss out on a fraction of a second.

I wonder if some of the confusion on the other thread about the final
aggregate[1] was due to this difference. By rounding down, we get a
"head start" (because the first aggregate is short), so we usually
manage to record the expected number of aggregates before time runs
out. It's a race though. Your non-rounding version was more likely
to lose the race and finish before the final expected aggregate was
logged, so you added code to force a final aggregate to be logged. Do
I have this right? I'm not entirely sure how useful a partial final
aggregate is (it's probably one you have to throw away, like the first
one, no? Isn't it better if we only have to throw away the first
one?). I'm not sure, but if we keep that change, a couple of very
minor nits: I found the "tx" parameter name a little confusing. Do
you think it's clearer if we change it to "final" (with inverted
sense)? For the final aggregate, shouldn't we call doLog() only if
agg->cnt > 0?

I think I'd be inclined to take that change back out though, making
this patch very small and net behaviour like pgbench 13, if you agree
with my explanation for why you had to add it and why it's not
actually necessary with the fixed rounding shown above. (And perhaps
in v15 we might consider other ideas like using hi-res times in the
log and not rounding, etc, a topic for later.)

I don't really see the value in the test that checks that $delay falls
in the range 1.5s - 2.5s and then ignores the result. If it hangs
forever, we'll find out about it, and otherwise no human or machine
will ever care about that test. I removed it from this version. Were
you really attached to it?

I made some very minor language tweaks in comments (we don't usually
shorten "benchmark" to "bench" in English, "series" keeps the -s in
singular (blame the Romans), etc).

I think we should make it clear when we mean the *Unix* epoch (a
comment "switch to epoch" isn't meaningful on its own, to me at
least), so I changed that in a few places.

[1] https://www.postgresql.org/message-id/alpine.DEB.2.22.394.2106102323310.3698412%40pseudo

Attachment Content-Type Size
v15-0001-Fix-bugs-in-pgbench-log-output.patch text/x-patch 9.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2021-07-08 12:19:38 Re: rand48 replacement
Previous Message vignesh C 2021-07-08 12:08:21 Re: [HACKERS] logical decoding of two-phase transactions