Re: pgbench - implement strict TPC-B benchmark

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgbench - implement strict TPC-B benchmark
Date: 2019-08-02 06:38:37
Message-ID: alpine.DEB.2.21.1908011649140.2692@lancre
Lists: pgsql-hackers


Hello Robert,

>> All in all, pgbench overheads are small compared to postgres processing
>> times and representative of a reasonably optimized client application.
>
> It's pretty easy to devise tests where pgbench is client-limited --
> just try running it with threads = clients/4, sometimes even
> clients/2. So I don't buy the idea that this is true in general.

OK, one thread cannot feed an N-core server if enough clients are run per
thread and the server has little to do.

The point I'm clumsily trying to make is that pgbench-specific overheads
are quite small: any benchmark driver would have at least pretty much the
same costs, because you have the CPU cost of the tool itself, then the
libraries it uses, e.g. lib{pq,c}, then syscalls. Even if the first cost is
reduced to zero, you still have to talk to the database through the
system, and this part will be the same.

As the cost of pgbench itself is a small part of the total CPU cost of
running the bench client side, no extraordinary improvement is to be
expected from optimizing this part. This does not mean that pgbench
performance should not be improved, if possible and maintainable.
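
To put the argument above in rough numbers, here is a minimal Python
sketch of the Amdahl-style bound it implies. The function name
max_speedup and the fractions below are hypothetical, chosen only for
illustration; they are not measurements of pgbench.

```python
def max_speedup(tool_fraction: float) -> float:
    """Upper bound on client-side speedup if the driver's own CPU cost fell to zero.

    tool_fraction is the assumed share of client-side CPU time spent in the
    benchmark driver itself, as opposed to libpq, libc and syscalls.
    """
    if not 0.0 <= tool_fraction < 1.0:
        raise ValueError("tool_fraction must be in [0, 1)")
    # Only the driver's share can be removed; the rest of the cost remains.
    return 1.0 / (1.0 - tool_fraction)


# Hypothetical shares, for illustration only (not measured values).
for fraction in (0.05, 0.10, 0.25):
    print(f"driver share {fraction:.0%}: at most {max_speedup(fraction):.2f}x")
```

Under these assumed shares the bound stays close to 1x, which is the
sense in which optimizing the driver alone cannot bring a large gain.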

I'll develop that point a little more in a reply to Andres' figures,
which are very interesting, by providing some more figures.

>> To try to salvage my illustration idea: I could change the name to "demo",
>> i.e. quite far from "TPC-B", do some extensions to make it differ, eg use
>> a non-uniform random generator, and then explicitly say that it is a
>> vaguely inspired by "TPC-B" and intended as a demo script susceptible to
>> be updated to illustrate new features (eg if using a non-uniform generator
>> I'd really like to add a permutation layer if available some day).
>>
>> This way, the "demo" real intention would be very clear.
>
> I do not like this idea at all; "demo" is about as generic a name as
> imaginable.

What name would you suggest, if it were to be made available from pgbench
as a builtin, that avoids confusion with "tpcb-like"?

> But I have another idea: how about including this script in the
> documentation with some explanatory text that describes (a) the ways in
> which it is more faithful to TPC-B than what the normal pgbench thing
> and (b) the problems that it doesn't solve, as enumerated by Fabien
> upthread:

We can put more examples in the documentation, ok.

One of the issues raised by Tom is that claiming faithfulness to TPC-B is
prone to legal issues. Frankly, I do not care about TPC-B, only that it
is a *real* benchmark, and that it allows illustrating pgbench
capabilities.

Another point is the confusion that would arise if two tpcb-like scripts
were provided.

So I'm fine with giving up any claim about faithfulness, especially as it
would allow the "demo" script to be more didactic and illustrate more
of pgbench capabilities.

> "table creation and data types, performance data collection, database
> configuration, pricing of hardware used in the tests, post-benchmark run
> checks, auditing constraints, whatever…"

I already put such caveats in comments and in the documentation, but that
does not seem to be enough for Tom.

> Perhaps that idea still won't attract any votes, but I throw it out
> there for consideration.

I think that adding an illustration section could be fine, but ISTM that
it would still be appropriate to have the example executable. Moreover, I
think that your idea does not fix the "we need not to make too many
claims about TPC-B to avoid potential legal issues" concern.

--
Fabien.
