Re: [POC] hash partitioning

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: amul sul <sulamul(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [POC] hash partitioning
Date: 2017-05-10 12:34:38
Message-ID: CAFjFpRfgVF98W5KuOFGWnJNfBuDNa=TYLpZRdrX4riM8u+QE5A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, May 3, 2017 at 6:39 PM, amul sul <sulamul(at)gmail(dot)com> wrote:
> On Thu, Apr 27, 2017 at 1:42 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>>
>> This is not yet a detailed review - I may be missing things, and
>> review and commentary from others is welcome. If there is no major
>> disagreement with the idea of moving forward using Amul's patch as a
>> base, then I will do a more detailed review of that patch (or,
>> hopefully, an updated version that addresses the above comments).
>

I agree that Amul's approach makes dump/restore feasible whereas
Nagata-san's approach makes that difficult. That is a major plus point
about Amul's patch. Also, it makes it possible to implement
Nagata-san's syntax, which is more user-friendly in future.

Here are some review comments after my initial reading of Amul's patch:

Hash partitioning will partition the data based on the hash value of the
partition key. Does that require collation? Should we throw an error/warning if
collation is specified in PARTITION BY clause?

+ int *indexes; /* Partition indexes; in case of hash
+ * partitioned table array length will be
+ * value of largest modulus, and for others
+ * one entry per member of the datums array
+ * (plus one if range partitioned table) */
This may be rewritten as "Partition indexes: For hash partitioned table the
number of indexes will be same as the largest modulus. For list partitioned
table the number of indexes will be same as the number of datums. For range
partitioned table the number of indexes will be number of datums plus one.".
You may be able to reword it to a shorter version, but essentially we will have
separate description for each strategy.

I guess, we need to change the comments for the other members too. For example
"datums" does not contain tuples with key->partnatts attributes for hash
partitions. It contains a tuple with two attributes, modulus and remainder. We
may not want to track null_index separately since rows with NULL partition key
will fit in the partition corresponding to the hash value of NULL. OR may be we
want to set null_index to partition which contains NULL values, if there is a
partition created for corresponding remainder, modulus pair and set has_null
accordingly. Accordingly we will need to update the comments.

cal_hash_value() may be renamed as calc_has_value() or compute_hash_value()?

Should we change the if .. else if .. construct in RelationBuildPartitionDesc()
to a switch case? There's very less chance that we will support a fourth
partitioning strategy, so if .. else if .. may be fine.

+ int mod = hbounds[i]->modulus,
+ place = hbounds[i]->remainder;
Although there are places in the code where we separate variable declaration
with same type by comma, most of the code declares each variable with the data
type on separate line. Should variable "place" be renamed as "remainder" since
that's what it is ultimately?

RelationBuildPartitionDesc() fills up mapping array but never uses it. In this
code the index into mapping array itself is the mapping so it doesn't need to
be maintained separately like list partiioning case. Similary next_index usage
looks unnecessary, although that probably improves readability, so may be fine.

+ * for p_p1: satisfies_hash_partition(2, 1, pkey, value)
+ * for p_p2: satisfies_hash_partition(4, 2, pkey, value)
+ * for p_p3: satisfies_hash_partition(8, 0, pkey, value)
+ * for p_p4: satisfies_hash_partition(8, 4, pkey, value)
What the function builds is satisfies_hash_partition(2, 1, pkey). I don't see
code to add value as an argument to the function. Is that correct?

+ int modulus = DatumGetInt32(datum);
May be you want to rename this variable to greatest_modulus like in the other
places.

+ Assert(spec->modulus > 0 && spec->remainder >= 0);
I liked this assertion. Do you want to add spec->modulus > spec->reminder also
here?

+ char *strategy; /* partitioning strategy
+ ('hash', 'list' or 'range') */

We need the second line to start with '*'

+-- check validation when attaching list partitions
Do you want to say "hash" instead of "list" here?

I think we need to explain the reasoning behind this syntax somewhere
as a README or in the documentation or in the comments. Otherwise it's
difficult to understand how various pieces of code are related.

This is not full review. I am still trying to understand how the hash
partitioning implementation fits with list and range partitioning. I
am going to continue to review this patch further.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-05-10 13:01:25 Re: Removal of plaintext password type references
Previous Message Thomas Munro 2017-05-10 12:02:16 Re: transition table behavior with inheritance appears broken (was: Declarative partitioning - another take)