Re: pgbench regression test failure

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: pgbench regression test failure
Date: 2017-09-12 19:21:50
Message-ID: alpine.DEB.2.20.1709122102320.4555@lancre
Lists: pgsql-hackers

> I have a serious, serious dislike for tests that seem to work until
> they're run on a heavily loaded machine.

I'm not so sure that load caused the error message. ISTM that it was
rather a case of finding 3 seconds in two because the run started just at
the right time, or maybe because of slowness induced by load and the order
in which the different checks are performed.
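
To illustrate one reading of "3 seconds in two": if activity is bucketed by wall-clock second, a two-second run that does not start on a second boundary touches three buckets. The helper below is only a sketch of that arithmetic. Its name and the millisecond units are assumptions for illustration, not pgbench code.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of pgbench): number of distinct one-second
 * buckets touched by a run covering [start_ms, start_ms + duration_ms).
 */
long
seconds_touched(long start_ms, long duration_ms)
{
    long first_bucket;
    long last_bucket;

    assert(start_ms >= 0 && duration_ms > 0);

    /* Bucket holding the first instant of the run. */
    first_bucket = start_ms / 1000;
    /* Bucket holding the last instant (the interval is half-open). */
    last_bucket = (start_ms + duration_ms - 1) / 1000;

    return last_bucket - first_bucket + 1;
}
```

With these definitions, seconds_touched(0, 2000) is 2, while seconds_touched(900, 2000) is 3: the same two-second run yields one more bucket just because of when it started.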

> So unless there is some reason why pgbench is *guaranteed* to run at
> least one transaction per thread, I'd rather the test not assume that.

Well, pgbench is for testing performance... so if the checks allow zero
performance that's quite annoying as well:-) The tests are designed to
require very low performance (eg there are a lot of -t 1 when only one
transaction is enough to check a point), but maybe some tests assume a
minimal requirement, maybe 10 tps with 2 threads...

> I would not necessarily object to doing something in the code that
> would guarantee that, though.

Hmmm. Interesting point.

There could be a client-side synchronization barrier, eg something like
"\sync :nclients/nthreads" could be easy enough to implement with pthread,
and quite error prone to use, but probably that could be okay for
validation purposes. Or maybe we could expose something at the SQL level,
eg "SELECT synchro('synchroname', howmanyclientstowait);" which would be
harder to implement server-side but possibly doable as well.

A simpler option may be to introduce a synchronization barrier at thread
start, so that all threads start together and that would set the "zero"
time. Not sure that would solve the potential issue you raise, although
that would help.
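
A minimal sketch of such a start barrier, assuming POSIX barriers (pthread_barrier_t) are available on the platform. All type and function names below are hypothetical and this is not how pgbench is structured, it only shows the idea: every thread waits until all are ready, one elected thread records the "zero" time, and all of them then read the same value.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical shared state; none of these names come from pgbench. */
typedef struct
{
    pthread_barrier_t start_barrier;
    struct timespec zero_time;      /* common "zero" time for the run */
} SharedStart;

typedef struct
{
    SharedStart *shared;
    struct timespec seen_zero;      /* zero time as observed by this thread */
} ThreadArg;

static void *
thread_main(void *p)
{
    ThreadArg  *arg = p;
    int         rc;

    /* Nobody proceeds until every thread has been created and got here. */
    rc = pthread_barrier_wait(&arg->shared->start_barrier);
    assert(rc == 0 || rc == PTHREAD_BARRIER_SERIAL_THREAD);

    /* Exactly one thread is elected to record the zero time. */
    if (rc == PTHREAD_BARRIER_SERIAL_THREAD)
        clock_gettime(CLOCK_MONOTONIC, &arg->shared->zero_time);

    /* Second wait: the zero time is published before anyone reads it. */
    pthread_barrier_wait(&arg->shared->start_barrier);
    arg->seen_zero = arg->shared->zero_time;

    /* ... a real benchmark would start running transactions here ... */
    return NULL;
}

/* Returns how many threads saw the same zero time, or -1 on error. */
int
run_with_start_barrier(int nthreads)
{
    SharedStart shared;
    pthread_t  *tids;
    ThreadArg  *args;
    int         i;
    int         same = 0;

    if (nthreads <= 0)
        return -1;
    if (pthread_barrier_init(&shared.start_barrier, NULL,
                             (unsigned) nthreads) != 0)
        return -1;

    tids = malloc(sizeof(pthread_t) * nthreads);
    args = malloc(sizeof(ThreadArg) * nthreads);
    if (tids == NULL || args == NULL)
        abort();

    for (i = 0; i < nthreads; i++)
    {
        args[i].shared = &shared;
        /* A partial start would leave the others stuck at the barrier. */
        if (pthread_create(&tids[i], NULL, thread_main, &args[i]) != 0)
            abort();
    }

    for (i = 0; i < nthreads; i++)
    {
        pthread_join(tids[i], NULL);
        if (args[i].seen_zero.tv_sec == shared.zero_time.tv_sec &&
            args[i].seen_zero.tv_nsec == shared.zero_time.tv_nsec)
            same++;
    }

    pthread_barrier_destroy(&shared.start_barrier);
    free(tids);
    free(args);
    return same;
}
```

Calling run_with_start_barrier(4) should return 4, that is, all four threads agree on the zero time. Note that this only aligns the start. It does not by itself guarantee that each thread completes a transaction before the run ends.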

Currently the statistics collection and outputs are performed by thread 0
in addition to the client it runs, so that pgbench would work even if
there are no threads, but it also means that under a heavy load some
things may not be done at the target time but a little bit later, if some
thread is stuck somewhere, although the async protocol tries to avoid that.

