Re: pgbench--new transaction type

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pgbench--new transaction type
Date: 2012-07-01 21:03:34
Message-ID: CAMkU=1zHL1djt2-KFyRAYKjWw1izg6OD-nt=BDP2h_3wUmQ=Og@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 20, 2012 at 12:32 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> On 01.06.2012 03:02, Jeff Janes wrote:
>>
>> I've attached a new patch which addresses several of your concerns,
>> and adds the documentation.  The description is much longer than the
>> descriptions of other nearby options, which mostly just give a simple
>> statement of what they do rather than a description of why that is
>> useful.  I don't know if that means I'm starting a good trend, or a
>> bad one, or I'm just putting the exposition in the wrong place.
>>
>> In addition to showing the benefits of coding things on the server
>> side when that is applicable, it also allows hackers to stress parts
>> of the server code that are not easy to stress otherwise.
>
>
> As you mentioned in your original email over a year ago, most of this could
> be done as a custom script. It's nice to have another workload to test, but
> then again, there's an infinite number of workloads that might be
> interesting.

I would argue that my specific proposed transaction does occupy a
somewhat privileged place, as it uses the default table structure, and
it fits a natural progression along the default, -N, -S continuum.
At each step, one or more bottlenecks are removed, which allows
different ones to bubble up to the top to be inspected and addressed.

I've written dozens of custom benchmark scripts, and so far this is
the only one I've thought was generally useful enough to belong in
core.

Do I need to make a better argument for why this particular
transaction is as generally useful as I think it is, or is the
objection to the idea of adding any new transactions at all?

> You can achieve the same with this custom script:
>
> \set loops 512
>
> do $$ DECLARE  sum integer default 0; amount integer; account_id integer;
> BEGIN FOR i IN 1..:loops LOOP   account_id=1 + floor(random() * :scale);
> SELECT abalance into strict amount FROM pgbench_accounts WHERE aid =
> account_id;   sum := sum + amount; END LOOP; END; $$;
>
> It's a bit awkward because it has to be all on one line, and you don't get
> the auto-detection of scale. But those are really the only differences
> AFAICS.

True. And we could fix the one-line thing by allowing lines ending in
"\" to be continued. And this might be mostly backwards compatible,
as I don't think there is much of a reason to end a statement with a
literal "\".
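
To make the continuation idea concrete, here is a minimal sketch in
Python of how lines ending in "\" could be joined before a script is
parsed. It is not pgbench's actual C parser, and the function name
join_continued_lines is hypothetical; it only illustrates the rule
described above.

```python
def join_continued_lines(script_text):
    """Join physical lines that end in a backslash into logical lines.

    Hypothetical illustration only; pgbench itself is written in C and
    this is not its parser.
    """
    logical_lines = []
    pending = ""
    for line in script_text.splitlines():
        if line.endswith("\\"):
            # Drop the trailing backslash and keep accumulating.
            pending += line[:-1]
        else:
            logical_lines.append(pending + line)
            pending = ""
    if pending:
        # A dangling continuation on the last line is emitted
        # without its backslash.
        logical_lines.append(pending)
    return logical_lines
```

A statement that really does end in a literal "\" would be merged with
the following line under this rule, which is the backward
compatibility risk mentioned above.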

We could make the scale be auto-detected, but that would be more
dangerous from a backward compatibility standpoint, as people with
existing custom scripts might rely on :scale being set to 1.

But a custom script would defeat my primary purpose of making it easy
for us to ask someone else posting on this list or on "performance" to
run this. "Copy this file someplace, then make sure you use the
correct scale, and remember where you put it, etc." is a much higher
overhead.

Also, my patch changes the output formatting in a way that couldn't be
done with a custom script, and the output might be very confusing if
it were not changed. I don't know how good of an argument this is.
"Remember to multiply the reported TPS by 512" is another barrier to
easy use.
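
The arithmetic behind that reminder, as a small Python sketch: with
the custom script, each pgbench transaction is one DO block that runs
512 selects, so the reported TPS counts DO blocks rather than selects.
The function name selects_per_second is hypothetical, and the default
of 512 is simply the loops value from the script quoted above.

```python
def selects_per_second(reported_tps, loops=512):
    """Convert pgbench's reported TPS into selects per second.

    Assumes each reported transaction executed `loops` selects,
    as in the custom script with "\\set loops 512".
    """
    return reported_tps * loops
```

For example, a run that reports 100 TPS with the custom script would
correspond to 51200 selects per second.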

>
> I think we would be better off improving pgbench to support those things in
> custom scripts. It would be nice to be able to write initialization steps
> that only run once in the beginning of the test.

I can think of three possibly desirable behaviors: "run once at -i
time", "run once per pgbench run", and "run once per connection".
Should all of those be implemented?

> You could then put the
> "SELECT COUNT(*) FROM pgbench_branches" there, to do the scale
> auto-detection.

So, we should add a way to do "\set variable_name <SQL QUERY>"?

Would we use a command other than \set to do that, or look at the
thing after the variable to decide if it is a query rather than a
simple expression?

Cheers,

Jeff
