Re: [POC] hash partitioning

From: amul sul <sulamul(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Thom Brown <thom(at)linux(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [POC] hash partitioning
Date: 2017-10-10 10:02:55
Message-ID: CAAJ_b96RTSDB5k=KnLbphjG=XdbubLj=Mhk_SKMaFyVdDoyzMA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 9, 2017 at 5:51 PM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> On Mon, Oct 9, 2017 at 4:44 PM, amul sul <sulamul(at)gmail(dot)com> wrote:
>

Thanks Ashutosh for your review, please find my comment inline.

>
>> 0002 few changes in partition-wise join code to support
>> hash-partitioned table as well & regression tests.
>
> + switch (key->strategy)
> + {
> + case PARTITION_STRATEGY_HASH:
> + /*
> + * Indexes array is same as the greatest modulus.
> + * See partition_bounds_equal() for more explanation.
> + */
> + num_indexes = DatumGetInt32(src->datums[ndatums - 1][0]);
> + break;
> This logic is duplicated at multiple places. I think it's time we consolidate
> these changes in a function/macro and call it from the places where we have to
> calculate number of indexes based on the information in partition descriptor.
> Refactoring existing code might be a separate patch and then add hash
> partitioning case in hash partitioning patch.
>

Make sense, added get_partition_bound_num_indexes() to get number of index
elements in 0001 & get_greatest_modulus() as name suggested to get the greatest
modulus of the hash partition bound in 0002.

> + int dim = hash_part? 2 : partnatts;
> Call the variable as natts_per_datum or just natts?
>

natts represents the number of attributes, but for the hash partition bound we
are not dealing with the attribute so that I have used short-form of dimension,
thoughts?

> + hash_part? true : key->parttypbyval[j],
> + key->parttyplen[j]);
> parttyplen is the length of partition key attribute, whereas what you want here
> is the length of type of modulus and remainder. Is that correct? Probably we
> need some special handling wherever parttyplen and parttypbyval is used e.g. in
> call to partition_bounds_equal() from build_joinrel_partition_info().
>

Unless I am missing something, I don't think we should worry about parttyplen
because in the datumCopy() when the datatype is pass-by-value then typelen
is ignored.

Regards,
Amul

Attachment Content-Type Size
0001-partition_bounds_copy-code-refactoring-v1.patch application/octet-stream 2.1 KB
0002-hash-partitioning_another_design-v24.patch application/octet-stream 86.2 KB
0003-Enable-partition-wise-join-support-v2.patch application/octet-stream 10.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message amul sul 2017-10-10 10:10:22 Re: [POC] hash partitioning
Previous Message Ashutosh Bapat 2017-10-10 10:00:02 Re: Partition-wise aggregation/grouping