Re: [POC] hash partitioning

From: amul sul <sulamul(at)gmail(dot)com>
To: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, David Steele <david(at)pgmasters(dot)net>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [POC] hash partitioning
Date: 2017-07-03 11:09:51
Message-ID: CAAJ_b94NhwHj=juf2c9xOei3ht8KBX_FJTyuM1p9wq-cng_V_g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jun 23, 2017 at 10:11 AM, Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:
> On Tue, 6 Jun 2017 13:03:58 +0530
> amul sul <sulamul(at)gmail(dot)com> wrote:
>
>
>> Updated patch attached.
>
> I looked into the latest patch (v13) and have some comments
> althogh they might be trivial.
>
Thanks for your review.

> First, I couldn't apply this patch to the latest HEAD due to
> a documentation fix and pgintend updates. It needes rebase.
>
> $ git apply /tmp/0002-hash-partitioning_another_design-v13.patch
> error: patch failed: doc/src/sgml/ref/create_table.sgml:87
> error: doc/src/sgml/ref/create_table.sgml: patch does not apply
> error: patch failed: src/backend/catalog/partition.c:76
> error: src/backend/catalog/partition.c: patch does not apply
> error: patch failed: src/backend/commands/tablecmds.c:13371
> error: src/backend/commands/tablecmds.c: patch does not apply
>
Fixed.

>
> <varlistentry>
> + <term>Hash Partitioning</term>
> +
> + <listitem>
> + <para>
> + The table is partitioned by specifying modulus and remainder for each
> + partition. Each partition holds rows for which the hash value of
> + partition keys when divided by specified modulus produces specified
> + remainder. For more clarification on modulus and remainder please refer
> + <xref linkend="sql-createtable-partition">.
> + </para>
> + </listitem>
> + </varlistentry>
> +
> + <varlistentry>
> <term>Range Partitioning</term>
>
> I think this section should be inserted after List Partitioning section because
> the order of the descriptions is Range, List, then Hash in other places of
> the documentation. At least,
>
Fixed in the attached version.

>
> - <firstterm>partition bounds</firstterm>. Currently supported
> - partitioning methods include range and list, where each partition is
> - assigned a range of keys and a list of keys, respectively.
> + <firstterm>partition bounds</firstterm>. The currently supported
> + partitioning methods are list, range, and hash.
> </para>
>
> Also in this hunk. I think "The currently supported partitioning methods are
> range, list, and hash." is better. We don't need to change the order of
> the original description.
>
Fixed in the attached version.

>
> <listitem>
> <para>
> - Declarative partitioning only supports list and range partitioning,
> - whereas table inheritance allows data to be divided in a manner of
> - the user's choosing. (Note, however, that if constraint exclusion is
> - unable to prune partitions effectively, query performance will be very
> - poor.)
> + Declarative partitioning only supports hash, list and range
> + partitioning, whereas table inheritance allows data to be divided in a
> + manner of the user's choosing. (Note, however, that if constraint
> + exclusion is unable to prune partitions effectively, query performance
> + will be very poor.)
>
> Similarly, I think "Declarative partitioning only supports range, list and hash
> partitioning," is better.
>
Fixed in the attached version.

>
> +
> + <para>
> + Create a hash partitioned table:
> +<programlisting>
> +CREATE TABLE orders (
> + order_id bigint not null,
> + cust_id bigint not null,
> + status text
> +) PARTITION BY HASH (order_id);
> +</programlisting></para>
> +
>
> This paragraph should be inserted between "Create a list partitioned table:"
> paragraph and "Ceate partition of a range partitioned table:" paragraph
> as well as range and list.
>
Fixed in the attached version.

>
> *strategy = PARTITION_STRATEGY_LIST;
> else if (pg_strcasecmp(partspec->strategy, "range") == 0)
> *strategy = PARTITION_STRATEGY_RANGE;
> + else if (pg_strcasecmp(partspec->strategy, "hash") == 0)
> + *strategy = PARTITION_STRATEGY_HASH;
> else
> ereport(ERROR,
>
> In the most of codes, the order is hash, range, then list, but only
> in transformPartitionSpec(), the order is list, range, then hash,
> as above. Maybe it is better to be uniform.
>
Make sense, fixed in the attached version.

>
> + {
> + if (strategy == PARTITION_STRATEGY_HASH)
> + ereport(ERROR,
> + (errcode(ERRCODE_UNDEFINED_OBJECT),
> + errmsg("data type %s has no default hash operator class",
> + format_type_be(atttype)),
> + errhint("You must specify a hash operator class or define a default hash operator class for the data type.")));
> + else
> + ereport(ERROR,
> + (errcode(ERRCODE_UNDEFINED_OBJECT),
> + errmsg("data type %s has no default btree operator class",
> + format_type_be(atttype)),
> + errhint("You must specify a btree operator class or define a default btree operator class for the data type.")));
> +
> +
>
> atttype,
> - "btree",
> - BTREE_AM_OID);
> + am_oid == HASH_AM_OID ? "hash" : "btree",
> + am_oid);
>
> How about writing this part as following to reduce code redundancy?
>
> + Oid am_oid;
> + char *am_name;
>
> <snip>
>
> + if (strategy == PARTITION_STRATEGY_HASH)
> + {
> + am_oid = HASH_AM_OID;
> + am_name = pstrdup("hash");
> + }
> + else
> + {
> + am_oid = BTREE_AM_OID;
> + am_name = pstrdup("btree");
> + }
> +
> if (!pelem->opclass)
> {
> - partopclass[attn] = GetDefaultOpClass(atttype, BTREE_AM_OID);
> + partopclass[attn] = GetDefaultOpClass(atttype, am_oid);
>
> if (!OidIsValid(partopclass[attn]))
> ereport(ERROR,
> (errcode(ERRCODE_UNDEFINED_OBJECT),
> - errmsg("data type %s has no default btree operator class",
> - format_type_be(atttype)),
> - errhint("You must specify a btree operator class or define a default btree operator class for the data type.")));
> + errmsg("data type %s has no default %s operator class",
> + format_type_be(atttype), am_name),
> + errhint("You must specify a %s operator class or define a default %s operator class for the data type.",
> + am_name, am_name)));
> +
> }
> else
> partopclass[attn] = ResolveOpClass(pelem->opclass,
> atttype,
> - "btree",
> - BTREE_AM_OID);
> + am_name,
> + am_oid);
>
I had to have same thoughts before (see v12 patch & before), but
change due to review comments upthread.

>
> There is meaningless indentation change.
>
> @@ -2021,7 +2370,8 @@ get_partition_for_tuple(PartitionDispatch *pd,
> /* bsearch in partdesc->boundinfo */
> cur_offset = partition_bound_bsearch(key,
> partdesc->boundinfo,
> - values, false, &equal);
> + values, false, &equal);
> +
> /*
> * Offset returned is such that the bound at offset is
>
Fixed in the attached version.

>
> Fixing the comment of pg_get_partkeydef() is missing.
>
> * pg_get_partkeydef
> *
> * Returns the partition key specification, ie, the following:
> *
> * PARTITION BY { RANGE | LIST } (column opt_collation opt_opclass [, ...])
> */
> Datum
> pg_get_partkeydef(PG_FUNCTION_ARGS)
> {
>
Thanks to catching this, fixed in the attached version.

Regards,
Amul

Attachment Content-Type Size
0001-Cleanup_v6.patch application/octet-stream 4.0 KB
0002-hash-partitioning_another_design-v14.patch application/octet-stream 77.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message amul sul 2017-07-03 11:12:11 Re: [POC] hash partitioning
Previous Message Amit Kapila 2017-07-03 10:40:44 Re: hash index on unlogged tables doesn't behave as expected