Re: pgbench: Skipping the creating primary keys after initialization

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgbench: Skipping the creating primary keys after initialization
Date: 2017-08-30 02:32:10
Message-ID: CAD21AoCEsZbeMp56g6hbhzvEc45npdx+bRuHrUf61r-J-Mx1Ow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 29, 2017 at 4:47 PM, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> wrote:
>
> Hello,
>
> Patch applies and works.
>
>>> I would like to insist a little bit: the name was declared as an int but
>>> passed to init and used as a bool there before the patch. Conceptually
>>> what
>>> is meant is really a bool, and I see no added value bar saving a few
>>> strokes
>>> to have an int. ISTM that recent pgbench changes have started turning old
>>> int-for-bool habits into using bool when bool is meant.
>>
>>
>> Since is_no_vacuum is a existing one, if we follow the habit we should
>> change other similar variables as well: is_init_mode, do_vacuum_accounts and
>> debug. And I think we should change them in a separated patch.
>
>
> Hmmm. The existing "is_no_vacuum" variable is typed *both* as int (in
> "main") and as bool (in "init"), called by main (yuk!). I see no reason to
> choose the bad one for the global:-)
>

Yeah, I think this might be a good timing to re-consider int-for-bool
habits in pgbench. If we decided to change is_no_vacuum to bool I want
to change other similar variables as well.

>> After more thought, I'm bit inclined to not have a short option for
>> --custom-initialize because this option will be used for not primary
>> cases. It would be better to save short options for future
>> enhancements of pgbench. Thought?
>
>
> I like it as is, especially as now the associated value is a simple and
> short string, I think that it makes sense to have a simple and short option
> to trigger it. Moreover -I stands cleanly for "initialization", and the
> capital stands for something a little special which it is. Its to good to
> miss.
>
> I think that the "-I" it should be added to the "--help" line, as it is done
> with other short & long options.

Okay, I'll leave it as of now. Maybe we can discuss later.

>
> Repeating "-I f" results in multiple foreign key constraints:
>
> Foreign-key constraints:
> "pgbench_tellers_bid_fkey" FOREIGN KEY (bid) REFERENCES
> pgbench_branches(bid)
> "pgbench_tellers_bid_fkey1" FOREIGN KEY (bid) REFERENCES
> pgbench_branches(bid)
> "pgbench_tellers_bid_fkey2" FOREIGN KEY (bid) REFERENCES
> pgbench_branches(bid)
> "pgbench_tellers_bid_fkey3" FOREIGN KEY (bid) REFERENCES
> pgbench_branches(bid)
>
> I wonder if this could be avoided easily? Maybe by setting the constraint
> name explicitely so that the second one fails on the existing one, which is
> fine, like for primary keys? Or adding a DROP CONSTRAINT IF EXISTS before
> the CREATE CONSTRAINT, like for tables? Or doing nothing about it? I would
> prefer the first option.

Good point, I agree with first option.

>
> Maybe the initial cleanup (DROP TABLE) could be made an option added to the
> default, so that cleaning up the database could be achieved with some
> "pgbench -i -I c", instead of connecting and droping the tables one by one
> which I have done quite a few times... What do you think?

Yeah, I sometimes wanted that. Having the cleaning up tables option
would be good idea.

> Before it is definitely engraved, I'm thinking about the letters:
>
> c - cleanup
> t - create table
> d - data
> p - primary key
> f - foreign key
> v - vacuum
>
> I think it is mostly okay, but it is the last time to think about it. Using
> "d" for cleanup (drop) would mean finding another letter for filling in
> data... maybe "g" for data generation? "c" may have been chosen for "create
> table", but then would not be available for "cleanup". Thoughts?

I'd say "g" for data generation would be better. Also, I'm inclined to
add a command for the unlogged tables. How about this?

c - cleanup
t - create table
u - create unlogged table
g - data generation
p - primary key
f - foreign key
v - vacuum

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2017-08-30 02:36:19 Re: expanding inheritance in partition bound order
Previous Message Thomas Munro 2017-08-30 02:22:21 Re: A design for amcheck heapam verification