Re: pgbench - allow to create partitioned tables

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Asif Rehman <asifr(dot)rehman(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgbench - allow to create partitioned tables
Date: 2019-09-10 05:48:12
Message-ID: CAA4eK1L8OjSJ7BRgoU-Up7j+FF0vZ6rCy4=ydXi7rztFVH1b_g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 26, 2019 at 11:04 PM Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> wrote:
>
>
> > Just one suggestion --partition-method option should be made dependent
> > on --partitions, because it has no use unless used with --partitions.
> > What do you think?
>

Some comments:
*
+ case 11: /* partitions */
+ initialization_option_set = true;
+ partitions = atoi(optarg);
+ if (partitions < 0)
+ {
+ fprintf(stderr, "invalid number of partitions: \"%s\"\n",
+ optarg);
+ exit(1);
+ }
+ break;

Is there a reason why we treat "partitions = 0" as a valid value?
Also, shouldn't we keep some max limit for this parameter as well?
Forex. how realistic it will be if the user gives the value of
partitions the same or greater than the number of rows in
pgbench_accounts table? I understand it is not sensible to give such
a value, but I guess the API should behave sanely in such cases as
well. I am not sure what will be the good max value for it, but I
think there should be one. Anyone else have any better suggestions
for this?

*
@@ -3625,6 +3644,7 @@ initCreateTables(PGconn *con)
const char *bigcols; /* column decls if accountIDs are 64 bits */
int declare_fillfactor;
};
+
static const struct ddlinfo DDLs[] = {

Spurious line change.

*
+ " --partitions=NUM partition account table in NUM parts
(defaults: 0)\n"
+ " --partition-method=(range|hash)\n"
+ " partition account table with this
method (default: range)\n"

Refer complete table name like pgbench_accounts instead of just
account. It will be clear and in sync with what we display in some
other options like --skip-some-updates.

*
+ " --partitions=NUM partition account table in NUM parts
(defaults: 0)\n"

/defaults/default.

*
I think we should print the information about partitions in
printResults. It can help users while analyzing results.

*
+enum { PART_NONE, PART_RANGE, PART_HASH }
+ partition_method = PART_NONE;
+

I think it is better to follow the style of QueryMode enum by using
typedef here, that will make look code in sync with nearby code.

*
- int i;

fprintf(stderr, "creating tables...\n");

- for (i = 0; i < lengthof(DDLs); i++)
+ for (int i = 0; i < lengthof(DDLs); i++)

This is unnecessary change as far as this patch is concerned. I
understand there is no problem in writing either way, but let's not
change the coding pattern here as part of this patch.

*
+ if (partitions >= 1)
+ {
+ int64 part_size = (naccounts * (int64) scale + partitions - 1) / partitions;
+ char ff[64];
+ ff[0] = '\0';
+ append_fillfactor(ff, sizeof(ff));
+
+ fprintf(stderr, "creating %d partitions...\n", partitions);
+
+ for (int p = 1; p <= partitions; p++)
+ {
+ char query[256];
+
+ if (partition_method == PART_RANGE)
+ {

part_size can be defined inside "if (partition_method == PART_RANGE)"
as it is used here. In general, this part of the code can use some
comments.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2019-09-10 06:23:25 Re: Replication & recovery_min_apply_delay
Previous Message Michael Paquier 2019-09-10 05:35:53 Re: Change atoi to strtol in same place