Re: pgbench - minor fix for meta command only scripts

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgbench - minor fix for meta command only scripts
Date: 2016-09-24 09:45:08
Message-ID: alpine.DEB.2.20.1609241112330.6473@lancre
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Hello Heikki,

> Yeah, it really is quite a mess. I tried to review your patch, and I think
> it's correct, but I couldn't totally convince myself, because of the existing
> messiness of the logic. So I bit the bullet and started refactoring.
>
> I came up with the attached. It refactors the logic in doCustom() into a
> state machine. I think this is much clearer, what do you think?

The patch did not apply to master because of you committed the sleep fix
in between. I updated the patch so that the fix is included as well.

I think that this is really needed. The code is much clearer and simple to
understand with the state machines & additional functions. This is a
definite improvement to the code base.

I've done quite some testing with various options (-r, --rate,
--latency-limit, -C...) and got pretty reasonnable results.

Although I cannot be absolutely sure that the refactoring does not
introduce any new bug, I'm convinced that it will be much easier to find
them:-)

Attached are some small changes to your version:

I have added the sleep_until fix.

I have fixed a bug introduced in the patch by changing && by || in the
(min_sec > 0 && maxsock != -1) condition which was inducing errors with
multi-threads & clients...

I have factored out several error messages in "commandFailed", in place of
the "metaCommandFailed", and added the script number as well in the error
messages. All messages are now specific to the failed command.

I have added two states to the machine:

- CSTATE_CHOOSE_SCRIPT which simplifies threadRun, there is now one call
to chooseScript instead of two before.

- CSTATE_END_COMMAND which manages is_latencies and proceeding to the
next command, thus merging the three instances of updating the stats
that were in the first version.

The later state means that processing query results is included in the per
statement latency, which is an improvement because before I was getting
some transaction latency significantly larger that the apparent sum of the
per-statement latencies, which did not make much sense...

I have added & updated a few comments. There are some places where the
break could be a pass through instead, not sure how desirable it is, I'm
fine with break.

> Well, the comment right there says "note this is not included in the
> statement latency numbers", so apparently it's intentional. Whether it's a
> good idea or not, I don't know :-). It does seem a bit surprising.

Indeed, it also results in apparently inconsistent numbers, and it creates
a mess for recording the statement latency because it meant that in some
case the latency was collected before the actual end of the command, see
the discussion about CSTATE_END_COMMAND above.

> But what seems more bogus to me is that we do that after recording the
> *transaction* stats, if this was the last command. So the PQgetResult() of
> the last command in the transaction is not included in the transaction stats,
> even though the PQgetResult() calls for any previous commands are. (Perhaps
> that's what you meant too?)
>
> I changed that in my patch, it would've been inconvenient to keep that old
> behavior, and it doesn't make any sense to me anyway.

Fine with me.

--
Fabien.

Attachment Content-Type Size
pgbench-refactor-2.patch text/x-diff 36.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2016-09-24 10:13:31 Re: ICU integration
Previous Message Thomas Munro 2016-09-24 09:07:32 Re: Refactor StartupXLOG?