Re: Desirable pgbench features?

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Desirable pgbench features?
Date: 2016-03-30 17:01:18
Message-ID: 56FC065E.9040908@agliodbs.com
Lists: pgsql-hackers

On 03/30/2016 08:29 AM, Fabien wrote:

> (1) TPC-B test driver must obtain a value from a query (the branch is
> the one of the chosen teller, not any random branch) and reuse it in
> another query. Currently the communication is one way, all results are
> silently discarded.
>
> This is not representative of client applications (say a web app) which
> interact with the server with some logic of their own, including
> reading things and writing others depending on the previous reading.
>
> This may be simulated on server side with a plpgsql script, but that
> would not exercise the client/server protocol logic and its performance
> impact, so I think that this simple read capability is important and
> missing.

Yes. Particularly, one of the things I'd like to benchmark is
load-balancing between masters and replicas, including checks for
coherency. Without being able to retrieve and reuse values, this can't
be tested.

The simplest way I'd see doing this is being able to SELECT INTO a
pgbench variable.
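
To make the read-then-reuse pattern concrete, here is a minimal sketch of what the client side of such a transaction does. It is only an illustration: Python's built-in sqlite3 stands in for a PostgreSQL connection, and the table and column names (branches, tellers, bid, tid, bbalance, tbalance) and the function name are assumptions loosely modeled on TPC-B, not pgbench's actual script.

```python
import sqlite3


def tpcb_branch_update(conn, tid, delta):
    """Apply delta to a teller and to that teller's own branch."""
    # Step 1: read the branch of the chosen teller. This is the result
    # that has to travel back to the client instead of being discarded.
    row = conn.execute("SELECT bid FROM tellers WHERE tid = ?", (tid,)).fetchone()
    if row is None:
        raise ValueError(f"unknown teller {tid}")
    bid = row[0]
    # Step 2: reuse the value just read in the following statements.
    conn.execute("UPDATE tellers SET tbalance = tbalance + ? WHERE tid = ?", (delta, tid))
    conn.execute("UPDATE branches SET bbalance = bbalance + ? WHERE bid = ?", (delta, bid))
    conn.commit()
    return bid


# Hypothetical toy schema: two branches with one teller each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE branches (bid INTEGER PRIMARY KEY, bbalance INTEGER);
CREATE TABLE tellers (tid INTEGER PRIMARY KEY, bid INTEGER, tbalance INTEGER);
INSERT INTO branches VALUES (1, 0), (2, 0);
INSERT INTO tellers VALUES (10, 1, 0), (11, 2, 0);
""")
print(tpcb_branch_update(conn, 11, 50))  # branch of teller 11
```

The point is only that the second and third statements depend on a value returned by the first, which is the round trip a SELECT INTO style feature would exercise.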

> (5) Consistency check: after a run, some properties are expected to be
> true, such as the balance of each branch matching the combined balance
> of its tellers and also of its accounts... This should/could be checked,
> maybe with an additional query.

I'd also love to have a client-only consistency check that I could run
within the pgbench unit of work itself. That is, a way to log "errors"
if, say, two variables were not equal at the end of the unit of work.

One use for this would be testing whether load-balanced connections
are getting "stale reads", especially since the *percentage* of stale
reads is what I want to know: 5% is acceptable, 50% is not.
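
As a rough sketch of the number I am after, assuming each unit of work records a pair of values (what was written to the master, what was then read back through the load balancer), the percentage could be computed like this. The pair layout and the function name are assumptions made for illustration only.

```python
def stale_read_percentage(samples):
    """Return the percentage of samples where the read value differs
    from the value written to the master.

    Each sample is a (master_value, replica_value) pair; this layout
    is an assumption of the sketch, not an existing pgbench output.
    """
    if not samples:
        return 0.0
    stale = sum(1 for master, replica in samples if master != replica)
    return 100.0 * stale / len(samples)


# Four units of work, one of which read an outdated value.
samples = [(10, 10), (11, 11), (12, 11), (13, 13)]
print(stale_read_percentage(samples))  # 25.0
```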

> * using values from a query
>
> For this use case (1), the best syntax and implementation is unclear. In
> particular, I'm not fond of the \gset syntax used in psql because the ';'
> is dropped and the \gset seems to act as a statement terminator.
>
> After giving it some thought, I would suggest a simple two-line explicit
> syntax compatible with current conventions, with a SELECT statement
> terminated with a ';', on one side and where to put the results on the
> other, something like:
>
> SELECT ... ;
> \into some variable names

This works for me if it works for the parser.

>
> Or maybe the other way around:
>
> \setsql some variable names
> SELECT ... ;

This also works, but is not my preference. It would be somewhat harder
to avoid variable/column mismatches.

One more wishlist item, which would make my request above for unit tests
unnecessary:

* Allow custom logging:

\vlog TAG varname1, varname2

Which would produce a custom log file called:

PID.TAG.varlog

With the format:

timestamp, var1, var2

e.g. if I had this:

SELECT id, abalance FROM account WHERE id = :aid;
\into :lid, :lbal

\vlog balancelog :lid, :lbal

It would create a file called:

2247.balancelog.varlog

and/or append a line:

2016-03-30 21:37:33.899, 511, 2150

This would allow CSV logging of all sorts of user custom information,
including de-facto response times.
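
For what it is worth, a file in that format would be trivial to post-process. Below is a minimal sketch of a reader, assuming the proposed "timestamp, var1, var2" layout shown above; \vlog and the .varlog file are only a proposal in this message, and the function name is hypothetical.

```python
import csv
import io
from datetime import datetime


def parse_varlog(text):
    """Parse lines of the proposed 'timestamp, var1, var2' format.

    Returns a list of (datetime, [values...]) tuples; values are kept
    as strings because the proposal does not fix their types.
    """
    rows = []
    # skipinitialspace drops the blank that follows each comma.
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not record:
            continue  # ignore empty lines
        stamp = datetime.strptime(record[0], "%Y-%m-%d %H:%M:%S.%f")
        rows.append((stamp, record[1:]))
    return rows


sample = "2016-03-30 21:37:33.899, 511, 2150\n"
for stamp, values in parse_varlog(sample):
    print(stamp.isoformat(), values)
```

Differences between consecutive timestamps would then give the de-facto response times mentioned above.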

--
Josh Berkus
Red Hat OSAS
(any opinions are my own)
