Fixing pgbench's logging of transaction timestamps

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Fixing pgbench's logging of transaction timestamps
Date: 2016-12-31 20:40:39
Message-ID: 8837.1483216839@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

pgbench's -l option is coded using inappropriate familiarity with the
contents of struct instr_time. I recall complaining about that when
the code went in, but to little avail. However, it needs to be fixed
if we're to switch over to using clock_gettime() as discussed in
the gettimeofday thread,
https://www.postgresql.org/message-id/17524.1483063359@sss.pgh.pa.us
And really it's unacceptably bad code even without that consideration,
because it doesn't work on Windows.

There are at least three ways we could fix it:

1. Switch over to printing the timestamp in the form of elapsed seconds
since the pgbench run start, as in the attached draft patch (which is
code-complete but lacks necessary documentation changes). You could make
an argument that this is a better definition than what's there: in most
situations, people are going to want the elapsed time, and right now they
have to do painful manual arithmetic to get it. About the only reason
I can see for liking the current definition is that it makes it possible
to correlate the pgbench log with external events, and I'm not sure
whether that's especially useful.

2. Have pgbench save both the INSTR_TIME_SET_CURRENT() and gettimeofday()
results for the run start instant. In doLog(), compute the elapsed time
from run start much as in the attached patch, but then add it to the saved
gettimeofday() result and print that in the existing format. This would
preserve the existing output format at the cost of a very small amount
of extra arithmetic per log line. However, it's got a nasty problem if
we use clock_gettime(CLOCK_MONOTONIC) in instr_time, which I think would
be the typical case. To the extent that CLOCK_MONOTONIC diverges from
CLOCK_REALTIME, which it would in case of external adjustments to the
system clock, the printed timestamps would no longer match the actual
system clock, which destroys the argument that you could correlate the
pgbench log with other events. (I imagine the same problem would apply
in the Windows implementation.)

3. Forget about using the instr_time result and just have doLog() execute
gettimeofday() to obtain the timestamp to print. This is kind of
conceptually ugly, but realistically the added overhead is probably
insignificant. A larger objection might be that on Windows, the result
of gettimeofday() isn't very high precision ... but it'd still be a huge
improvement over the non-answer you get now.

I'm inclined to think that #2 isn't a very good choice; it appears to
preserve the current behavior but really doesn't. So we should either
change the behavior as in #1 or expend an extra system call as in #3.
Preferences?

regards, tom lane

Attachment Content-Type Size
pgbench-log-delta-time.patch text/x-diff 2.2 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-12-31 23:47:11 Re: Make pg_basebackup -x stream the default
Previous Message Tom Lane 2016-12-31 18:38:55 Re: safer node casting