Re: Multi-Column List Partitioning

From: Nitin Jadhav <nitinjadhavpostgres(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Multi-Column List Partitioning
Date: 2021-08-25 12:42:09
Message-ID: CAMm1aWaX4bOnNOp78j9DAE3T_dnrdY2fb+7z+uHYjD=JfSrczQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> The new list bound binary search and related comparison support
> function look a bit too verbose to me. I was expecting
> partition_list_bsearch() to look very much like
> partition_range_datum_bsearch(), but that is not the case. The
> special case code that you wrote in partition_list_bsearch() seems
> unnecessary, at least in that function. I'm talking about the code
> fragment starting with this comment:
>
> I will look at other parts of the patch next week hopefully. For
> now, attached is a delta patch that applies on top of your v1, which
> does:
>
> * Simplify partition_list_bsearch() and partition_lbound_datum_cmp()
> * Make qsort_partition_list_value_cmp simply call
> partition_lbound_datum_cmp() instead of having its own logic to
> compare input bounds
> * Move partition_lbound_datum_cmp() into partbounds.c as a static
> function (export seems unnecessary)
> * Add a comment for PartitionBoundInfo.isnulls and remove that for null_index

Yes. You are right. The extra code added in partition_list_bsearch()
is not required and thanks for sharing the delta patch. It looks good
to me and I have incorporated the changes in the attached patch.

> I guess you're perhaps trying to address the case where the caller
> does not specify the values for all of the partition key columns,
> which can happen when the partition pruning code needs to handle a set
> of clauses matching only some of the partition key columns. But
> that's a concern of the partition pruning code and so the special case
> should be handled there (if at all), not in the binary search function
> that is shared with other callers. Regarding that, I'm wondering if
> we should require clauses matching all of the partition key columns to
> be found for the pruning code to call the binary search, so do
> something like get_matching_hash_bounds() does:
>
> Do you think that trying to match list partitions even with fewer keys
> is worth the complexity of the implementation? That is, is the use
> case to search for only a subset of partition key columns common
> enough with list partitioning?
>
> If we do decide to implement the special case, remember that to do
> that efficiently, we'd need to require that the subset of matched key
> columns constitutes a prefix, because of the way the datums are
> sorted. That is, match all partitions when the query only contains a
> clause for b when the partition key is (a, b, c), but engage the
> special case of pruning if the query contains clauses for a, or for a
> and b.

Thanks for the suggestion. Below is the implementation details for the
partition pruning for multi column list partitioning.

In the existing code (For single column list partitioning)
1. In gen_partprune_steps_internal(), we try to match the where
clauses provided by the user with the partition key data using
match_clause_to_partition_key(). Based on the match, this function can
return many values like PARTCLAUSE_MATCH_CLAUSE,
PARTCLAUSE_MATCH_NULLNESS, PARTCLAUSE_NOMATCH, etc.
2. In case of PARTCLAUSE_MATCH_CLAUSE, we generate steps using
gen_prune_steps_from_opexps() (strategy-2) which generate and return a
list of PartitionPruneStepOp that are based on OpExpr and BooleanTest
clauses that have been matched to the partition key and it also takes
care handling prefix of the partition keys.
3. In case of PARTCLAUSE_MATCH_NULLNESS, we generate steps using
gen_prune_step_op() (strategy-1) which generates single
PartitionPruneStepOp since the earlier list partitioning supports
single column and there can be only one NULL value. In
get_matching_list_bounds(), if the nullkeys is not empty, we fetch the
partition index which accepts null and we used to return from here.

In case of multi column list partitioning, we have columns more than
one and hence there is a possibility of more than one NULL values in
the where clauses. The above mentioned steps are modified like below.

1. Modified the match_clause_to_partition_key() to generate an object
of PartClauseInfo structure and return PARTCLAUSE_MATCH_CLAUSE even in
case of clauses related to NULL. The information required to generate
PartClauseInfo is populated here like the constant expression
consisting of (Datum) 0, op_strategy, op_is_ne, etc.
2. Since I am returning PARTCLAUSE_MATCH_CLAUSE, now we use strategy-2
(gen_prune_steps_from_opexps) to generate partition pruning steps.
This function takes care of generating a list of pruning steps if
there are multiple clauses and also takes care of handling prefixes.
3. Modified perform_pruning_base_step() to generate the datum values
and isnulls data of the where clauses. In case if any of the key
contains NULL value then the corresponding datum value is 0.
4. Modified get_matching_list_bounds() to generate the minimum offset
and/or maximum offset of the matched values based on the difference
operation strategies. Now since the NULL containing bound values are
part of 'boundinfo', changed the code accordingly to include the NULL
containing partitions or not in different scenarios like
InvalidStrategy, etc.

I have done some cosmetic changes to
v1_multi_column_list_partitioning.patch. So all the above code changes
related to partition pruning are merged with the previous patch and
also included the delta patch shared by you. Hence sharing a single
patch.

Kindly have a look and share your thoughts.

On Fri, Jun 11, 2021 at 10:57 PM Zhihong Yu <zyu(at)yugabyte(dot)com> wrote:
>
>
>
> On Thu, Jun 10, 2021 at 8:38 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>>
>> Hi Nitin,
>>
>> On Thu, Jun 3, 2021 at 11:45 PM Nitin Jadhav
>> <nitinjadhavpostgres(at)gmail(dot)com> wrote:
>> > > I'll wait for you to post a new patch addressing at least the comments
>> > > in my earlier email. Also, please make sure to run `make check`
>> > > successfully before posting the patch. :)
>> >
>> > I have fixed all of the review comments given by you and Jeevan in the
>> > attached patch and also the attached patch contains more changes
>> > compared to the previous patch. Following are the implementation
>> > details.
>>
>> Thanks for the updated version.
>>
>> > 1. Regarding syntax, the existing syntax will work fine for the
>> > single-column list partitioning. However I have used the new syntax
>> > for the multi-column list partitioning as we discussed earlier. I have
>> > used a combination of 'AND' and 'OR' logic for the partition
>> > constraints as given in the below example.
>> >
>> > postgres(at)17503=#create table t(a int, b text) partition by list(a,b);
>> > CREATE TABLE
>> > postgres(at)17503=#create table t1 partition of t for values in ((1,'a'),
>> > (NULL,'b'));
>> > CREATE TABLE
>> > postgres(at)17503=#\d+ t
>> > Partitioned table "public.t"
>> > Column | Type | Collation | Nullable | Default | Storage |
>> > Compression | Stats target | Description
>> > --------+---------+-----------+----------+---------+----------+-------------+--------------+-------------
>> > a | integer | | | | plain |
>> > | |
>> > b | text | | | | extended |
>> > | |
>> > Partition key: LIST (a, b)
>> > Partitions: t1 FOR VALUES IN ((1, 'a'), (NULL, 'b'))
>> >
>> > postgres(at)17503=#\d+ t1
>> > Table "public.t1"
>> > Column | Type | Collation | Nullable | Default | Storage |
>> > Compression | Stats target | Description
>> > --------+---------+-----------+----------+---------+----------+-------------+--------------+-------------
>> > a | integer | | | | plain |
>> > | |
>> > b | text | | | | extended |
>> > | |
>> > Partition of: t FOR VALUES IN ((1, 'a'), (NULL, 'b'))
>> > Partition constraint: (((a = 1) AND (b = 'a'::text)) OR ((a IS NULL)
>> > AND (b = 'b'::text)))
>> > Access method: heap
>>
>> The constraint expressions seem to come out correctly, though I
>> haven't checked your implementation closely yet.
>>
>> > 2. In the existing code, NULL values were handled differently. It was
>> > not added to the 'datums' variable, rather used to store the partition
>> > index directly in the 'null_index' variable. Now there is a
>> > possibility of multiple NULL values, hence introducing a new member
>> > 'isnulls' in the 'PartitionBoundInfoData' struct which indicates
>> > whether the corresponding element in the 'datums' is NULL. Now
>> > 'null_index' cannot be used directly to store the partition index, so
>> > removed it and made the necessary changes in multiple places.
>> >
>> > 3. I have added test cases for 'create table' and 'insert' statements
>> > related to multi-column list partitioning and these are working fine
>> > with 'make check'.
>> >
>> > 4. Handled the partition pruning code to accommodate these changes for
>> > single-column list partitioning. However it is pending for
>> > multi-column list partitioning.
>> >
>> > 5. I have done necessary changes in partition wise join related code
>> > to accommodate for single-column list partitioning. However it is
>> > pending for multi-column list partitioning.
>> >
>> > Kindly review the patch and let me know if any changes are required.
>>
>> The new list bound binary search and related comparison support
>> function look a bit too verbose to me. I was expecting
>> partition_list_bsearch() to look very much like
>> partition_range_datum_bsearch(), but that is not the case. The
>> special case code that you wrote in partition_list_bsearch() seems
>> unnecessary, at least in that function. I'm talking about the code
>> fragment starting with this comment:
>>
>> /*
>> * Once we find the matching for the first column but if it does not
>> * match for the any of the other columns, then the binary search
>> * will not work in all the cases. We should traverse just below
>> * and above the mid index until we find the match or we reach the
>> * first mismatch.
>> */
>>
>> I guess you're perhaps trying to address the case where the caller
>> does not specify the values for all of the partition key columns,
>> which can happen when the partition pruning code needs to handle a set
>> of clauses matching only some of the partition key columns. But
>> that's a concern of the partition pruning code and so the special case
>> should be handled there (if at all), not in the binary search function
>> that is shared with other callers. Regarding that, I'm wondering if
>> we should require clauses matching all of the partition key columns to
>> be found for the pruning code to call the binary search, so do
>> something like get_matching_hash_bounds() does:
>>
>> /*
>> * For hash partitioning we can only perform pruning based on equality
>> * clauses to the partition key or IS NULL clauses. We also can only
>> * prune if we got values for all keys.
>> */
>> if (nvalues + bms_num_members(nullkeys) == partnatts)
>> {
>> /* code to compute matching hash bound offset */
>> }
>> else
>> {
>> /* Report all valid offsets into the boundinfo->indexes array. */
>> result->bound_offsets = bms_add_range(NULL, 0,
>> boundinfo->nindexes - 1);
>> }
>>
>> Do you think that trying to match list partitions even with fewer keys
>> is worth the complexity of the implementation? That is, is the use
>> case to search for only a subset of partition key columns common
>> enough with list partitioning?
>>
>> If we do decide to implement the special case, remember that to do
>> that efficiently, we'd need to require that the subset of matched key
>> columns constitutes a prefix, because of the way the datums are
>> sorted. That is, match all partitions when the query only contains a
>> clause for b when the partition key is (a, b, c), but engage the
>> special case of pruning if the query contains clauses for a, or for a
>> and b.
>>
>> I will look at other parts of the patch next week hopefully. For
>> now, attached is a delta patch that applies on top of your v1, which
>> does:
>>
>> * Simplify partition_list_bsearch() and partition_lbound_datum_cmp()
>> * Make qsort_partition_list_value_cmp simply call
>> partition_lbound_datum_cmp() instead of having its own logic to
>> compare input bounds
>> * Move partition_lbound_datum_cmp() into partbounds.c as a static
>> function (export seems unnecessary)
>> * Add a comment for PartitionBoundInfo.isnulls and remove that for null_index
>>
>> --
>> Amit Langote
>> EDB: http://www.enterprisedb.com
>
>
> Hi, Amit:
>
> + * isnulls is an array of boolean-tuples with key->partnatts booleans values
> + * each. Currently only used for list partitioning, it stores whether a
>
> I think 'booleans' should be 'boolean'.
> The trailing word 'each' is unnecessary.
>
> Cheers

Attachment Content-Type Size
v2-0001-multi-column-list-partitioning.patch application/octet-stream 104.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2021-08-25 13:03:53 Re: Added schema level support for publication.
Previous Message Masahiko Sawada 2021-08-25 12:40:12 Re: Failure of subscription tests with topminnow