Re: pgbench logging broken by time logic changes

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Gregory Smith <gregsmithpgsql(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, david(dot)christensen(at)crunchydata(dot)com
Subject: Re: pgbench logging broken by time logic changes
Date: 2021-07-11 08:16:56
Message-ID: CA+hUKGJtjMa84cr=TRak3k_Db3BXKZr6rNMVMmed4ipVA86xyw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Fabien,

I committed the code change without the new TAP tests, because I
didn't want to leave the open item hanging any longer. As for the
test, ...

On Sat, Jul 10, 2021 at 9:36 PM Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> wrote:
> >> I hoped we were done here but I realised that your check for 1-3 log
> >> lines will not survive the harsh environment of the build farm.
> >> Adding sleep(2) before the final doLog() confirms that. I had two
> >> ideas:

> I misread your point. You think that it should fail, but it is not
> tried yet. I'm rather optimistic that it should not fail, but I'm okay
> with averting the risk anyway.

... I know it can fail, and your v18 didn't fix that, because...

+check_pgbench_logs($bdir, '001_pgbench_log_1', $nthreads, 1, 3,

^
|

... this range can be exceeded.

That's because extra aggregations are created based on doLog() reading
the clock after a transaction is finished, entirely independently of
the -T mechanism deciding when to stop the benchmark, and potentially
many seconds later in adverse conditions. As I mentioned, you can see
it fail with your own eyes if you hack the code like so:

if (agg_interval > 0)
{
+ /*
+ * XXX: simulate an overloaded raspberry pi swapping to a microsd
+ * card or other random delays as we can expect in the build farm
+ */
+ sleep(3);
/* log aggregated but not yet reported transactions */
doLog(thread, state, &aggs, false, 0, 0);
}

> I stand by this solution which should allow to get some data from the
> field, as v18 attached. If all is green then the TODO could be removed
> later.

I suspect the number of aggregates could be made deterministic, as I
described in an earlier message. What do you think about doing
something like that first for the next release, before trying to add
assertions about the number of aggregates? I'm with you on the
importance of testing, but it seems better to start by making the
thing more testable.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2021-07-11 09:27:00 Re: Enhanced error message to include hint messages for redundant options error
Previous Message Michael Paquier 2021-07-11 04:16:05 Re: [PATCH] Pull general SASL framework out of SCRAM