Re: Multi-Column List Partitioning

From: Amul Sul <sulamul(at)gmail(dot)com>
To: Nitin Jadhav <nitinjadhavpostgres(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Multi-Column List Partitioning
Date: 2021-12-06 14:52:04
Message-ID: CAAJ_b96_1B+Qi9qqpQT46vuH1M=iGGvs=VQG=3OPcr0njxw_jQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Dec 6, 2021 at 7:27 PM Nitin Jadhav
<nitinjadhavpostgres(at)gmail(dot)com> wrote:
>
> Thank you for reviewing the patch.
>
> > partbounds.c: In function ‘get_qual_for_list.isra.18’:
> > partbounds.c:4284:29: warning: ‘boundinfo’ may be used uninitialized
> > in this function [-Wmaybe-uninitialized]
> > datumCopy(bound_info->datums[i][j],
> > ~~~~~~~~~~^~~~~~~~
> > partbounds.c:4335:21: note: ‘boundinfo’ was declared here
> > PartitionBoundInfo boundinfo;
> > ^~~~~~~~~
> > partbounds.c: In function ‘partition_bounds_merge’:
> > partbounds.c:1305:12: warning: ‘inner_isnull’ may be used
> > uninitialized in this function [-Wmaybe-uninitialized]
> > bool *inner_isnull;
> > ^~~~~~~~~~~~
> > partbounds.c:1304:12: warning: ‘outer_isnull’ may be used
> > uninitialized in this function [-Wmaybe-uninitialized]
> > bool *outer_isnull;
> > ^~~~~~~~~~~~
>
> Fixed.
>
> > This function is unnecessarily complicated, I think you can avoid
> > inner for loops; simply replace for-loop-block with "if
> > (equal(lfirst(cell), new_bound)) return true".
>
> Thank you for the suggestion. Fixed.
>
> > + char **colname = (char **) palloc0(partnatts * sizeof(char *));
> > + Oid *coltype = palloc0(partnatts * sizeof(Oid));
> > + int32 *coltypmod = palloc0(partnatts * sizeof(int));
> > + Oid *partcollation = palloc0(partnatts * sizeof(Oid));
> > +
> > This allocation seems to be worthless, read ahead.
> >
> > I think there is no need for this separate loop inside
> > transformPartitionListBounds, you can do that same in the next loop as
> > well. And instead of get_partition_col_* calling and storing, simply
> > use that directly as an argument to transformPartitionBoundValue().
>
> Yes. The loop can be avoided and content of the above loop can be
> included in the next loop but the next loop iterates over a list of
> multi column datums. For each iteration, we need the information of
> all the columns. The above data (colname, coltype, coltypmod and
> partcollation) remains same for each iteration of the loop, If we
> modify as suggested, then the function to fetch these information has
> to be called every-time. To avoid this situation I have made a
> separate loop outside which only runs as many number of columns and
> stores in a variable which can be reused later. Please let me correct
> if I am wrong.
>

Ok, colname can be fetched in advance but I don't think it worth it to
fetch coltype, coltypmod & partcollation; and, store in the
explicitly allocated memory, instead, you can directly call
get_partition_col_* inline functions.

> > I think this should be inside the "else" block after "!IsA(rowexpr,
> > RowExpr)" error and you can avoid IsA() check too.
>
> This is required to handle the situation when one partition key is
> mentioned and multiple values are provided in the partition bound
> specification.
>
> > Looks difficult to understand at first glance, how about the following:
> >
> > if (b1->isnulls != b2->isnulls)
> > return false;
> >
> > if (b1->isnulls)
> > {
> > if (b1->isnulls[i][j] != b2->isnulls[i][j])
> > return false;
> > if (b1->isnulls[i][j])
> > continue;
> > }
> >
> > See how range partitioning infinite values are handled. Also, place
> > this before the comment block that was added for the "!datumIsEqual()"
> > case.
>
> Fixed. I feel the 'continue' block is not required and hence removed it.
>
> > Nothing wrong with this but if we could have checked "dest->isnulls"
> > instead of "src->isnulls" would be much better.
>
> Here we are copying the data from 'src' to 'dest'. If there is no data
> in 'src', it is unnecessary to copy. Hence checking 'src'.
>

I am not sure how that makes a difference since you do allocate 'dest'
based on 'src'; anyway, I leave that choice to you.

> > Condition "key->strategy != PARTITION_STRATEGY_LIST" seems to be unnecessary.
>
> Fixed.
>
> > Can't be a single loop?
>
> Yes. Fixed.
>

Thanks, will have a look.

Regards,
Amul

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2021-12-06 14:56:58 Re: MSVC SSL test failure
Previous Message Amit Langote 2021-12-06 14:34:29 Re: pg_get_publication_tables() output duplicate relid