Re: pgbench - extend initialization phase control

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: btendouan <btendouan(at)oss(dot)nttdata(dot)com>, "ibrar(dot)ahmad(at)gmail(dot)com:" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgbench - extend initialization phase control
Date: 2019-10-31 14:54:00
Message-ID: alpine.DEB.2.21.1910310904420.27369@lancre
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Hello Masao-san,

> + snprintf(sql, sizeof(sql),
> + "insert into pgbench_branches(bid,bbalance) "
> + "select bid, 0 "
> + "from generate_series(1, %d) as bid", scale);
>
> "scale" should be "nbranches * scale".

Yep, even if nbranches is 1, it should be there.

> + snprintf(sql, sizeof(sql),
> + "insert into pgbench_accounts(aid,bid,abalance,filler) "
> + "select aid, (aid - 1) / %d + 1, 0, '' "
> + "from generate_series(1, %d) as aid", naccounts, scale * naccounts);
>
> Like client-side data generation, INT64_FORMAT should be used here
> instead of %d?

Indeed.

> If large scale factor is specified, the query for generating pgbench_accounts
> data can take a very long time. While that query is running, operators may be
> likely to do Ctrl-C to cancel the data generation. In this case, IMO pgbench
> should cancel the query, i.e., call PQcancel(). Otherwise, the query will keep
> running to the end.

Hmmm. Why not. Now the infra to do that seems to already exists twice,
once in "src/bin/psql/common.c" and once in "src/bin/scripts/common.c".

I cannot say I'm thrilled to replicate this once more. I think that the
reasonable option is to share this in fe-utils and then to reuse it from
there. However, ISTM that such a restructuring patch which not belong to
this feature.

> - for (step = initialize_steps; *step != '\0'; step++)
> + for (const char *step = initialize_steps; *step != '\0'; step++)
>
> Per PostgreSQL basic coding style,

C99 (20 years ago) is new the norm, and this style is now allowed, there
are over a hundred instances of these already. I tend to use that where
appropriate.

> - fprintf(stderr, "unrecognized initialization step \"%c\"\n",
> + fprintf(stderr,
> + "unrecognized initialization step \"%c\"\n"
> + "Allowed step characters are: \"" ALL_INIT_STEPS "\".\n",
> *step);
> - fprintf(stderr, "allowed steps are: \"d\", \"t\", \"g\", \"v\",
> \"p\", \"f\"\n");
>
> The original message seems better to me. So what about just appending "G"
> into the above latter message? That is,
> "allowed steps are: \"d\", \"t\", \"g\", \"G\", \"v\", \"p\", \"f\"\n"

I needed this list in several places, so it makes sense to share the
definition, and frankly the list of half a dozen comma-separated chars
does not strike me as much better than just giving the allowed chars
directly. So the simpler the better, from my point of view.

> Isn't it better to explain a bit more what "client-side / server-side data
> generation" is? For example, something like

Ok.

Attached v7 does most of the above, but the list of char message and the
signal handling. The first one does not look really better to me, and the
second one belongs to a restructuring patch that I'll try to submit.

--
Fabien.

Attachment Content-Type Size
pgbench-init-extended-7.patch text/x-diff 8.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ibrar Ahmed 2019-10-31 14:54:25 Re: [BUG] Partition creation fails after dropping a column and adding a partial index
Previous Message Andres Freund 2019-10-31 14:53:20 Re: function calls optimization