Re: Boolean partitions syntax

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp
Cc: jonathan(dot)katz(at)excoventures(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, david(dot)rowley(at)2ndquadrant(dot)com, robertmhaas(at)gmail(dot)com, peter(dot)eisentraut(at)2ndquadrant(dot)com, sfrost(at)snowman(dot)net, hornschnorter(at)gmail(dot)com, dilipbalaut(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Boolean partitions syntax
Date: 2018-04-19 11:35:43
Message-ID: 20180419.203543.118371377.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for reviewing.

At Wed, 18 Apr 2018 19:27:16 +0900, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote in <7ac6b44e-4638-3320-1512-f6c03a28d0f7(at)lab(dot)ntt(dot)co(dot)jp>
> Horiguchi-san,
>
> Thank you for updating the patch.
>
> On 2018/04/16 16:17, Kyotaro HORIGUCHI wrote:
> > the attached v6 patch differs only in gram.y since v5.
>
> Patch fails to compile, because it adds get_partition_col_collation to
> rel.h instead of partcache.h:
>
> src/include/utils/rel.h: In function ‘get_partition_col_collation’:
> src/include/utils/rel.h:594:12: error: dereferencing pointer to incomplete
> type ‘struct PartitionKeyData’
>
> PartitionKeyData definition was recently moved to partcache.h as part of
> the big partition code reorganization.

Ugg.. Thanks.

> > At Fri, 13 Apr 2018 18:55:30 +0900, Amit Langote wrote:
> >> Looking at the gram.y changes in the latest patch, I think there needs to
> >> be some explanatory comments about about the new productions -- u_expr,
> >> b0_expr, and c0_expr.
> >
> > I think I did that. And refactord the rules.
> >
> > It was a bother that some rules used c_expr directly but I
> > managed to replace all of them with a_expr by lowering precedence
> > of some ordinary keywords (PASSING, BY, COLUMNS and ROW). c_expr
> > is no loger used elsewhere so we can just remove columnref from
> > c_expr. Finally [abc]0_expr was eliminated and we have only
> > a_expr, b_expr, u_expr and c_expr. This seems simple enough.
> >
> > The relationship among the rules after this change is as follows.
> >
> > a_expr --+-- columnref
> > +-- u_expr ---+-- c_expr -- (all old c_expr stuff except columnref)
> > +-- (all old a_expr stuff)
> >
> > b_expr --+-- columnref
> > +-- c_expr -- (ditto)
> > +-- (all old b_expr stuff)
> >
> > On the way fixing this, I noticed that the precedence of some
> > keywords (PRESERVE, STRIP_P) that was more than necessary and
> > corrected it.
>
> Thank you for simplifying gram.y changes. I think I was able to
> understand them. Having said that, I think your proposed patch to gram.y
> could be rewritten such that they do not *appear* to impact the syntax for
> other features like XML, etc. For example, allowing a_expr in places
> where previously only c_expr was allowed seems to me a dangerous, although
> I don't have any examples to show for it.

It cannot be dangerous, since "(a_expr)" is a c_expr. The
difference is just the way resolve conflicts. The words of which
I changed precedence are used only in the the problematic
contexts.

> It seems that we would like to use a_expr syntax but without columnref for
> partition_bound expressions. No columnrefs because they cause conflicts
> with unreserved keywords MINVALUE and MAXVALUE that are used in range

Right.

> bound productions. I think that can be implemented with minimal changes
> to expression syntax productions, as shown in the attached patch.

Agreed that the refactor stuff can be separated, but I'm a bit
uneasy with the increased number of new syntax. The purpose of
the change of c_expr in v6 was to minimize the *structure* change
of *_expr syntax. I agree that v8 style is preferable to
backpatch to PG10, but I'd still like to use v6 style for PG11.

> About the changes in transformPartitionBoundValue() to check for collation
> conflict, I think that seems unnecessary. We're not going to use the
> partition bound expression's collation anywhere, so trying to validate it
> seems unnecessary. I wondered if we should issue a WARNING to warn the
> user that COLLATE specified in a partition bound expression is ignored,

That's definitely wrong.

Partition key expression is inheriting the collation of the base
column and any collation can be specifiable on the expression.
Similary we can specify collation on partition bound values with
this patch. If a key expression and a bound value have
contradicting collations, we cannot determine the ordering of the
values and should error out.

Collation is requried for those languages where the letter
ordering is different from ordinary order. For example,

=# create table p (a text collate "sv_SE") partition by range (a);
=# create table c1 partiton of p for values from ('x') to ('ö');

The above is accepted in V10 but,

=# create table p (a text collate "en_US") partition by range (a);
postgres=# create table c1 partiton of p for values from ('x') to ('ö');
ERROR: syntax error at or near "partiton"
LINE 1: create table c1 partiton of p for values from ('x') to ('ö')...

This is because the collation of the key column is actually
*used* to check the range bounds. With this fix partition bound
values can have collate specification and if it differs from key
collation, it is just a mess.

> but not sure after seeing that we issue none when a user specifies
> COLLATION in the expression for a column's DEFAULT value. We do still
> store the COLLATION expression in pg_attrdef, but it doesn't seem to be
> used anywhere.

DEFAULT value is not compared with anything, partition key
expression *is* compared with column value and the collation is
applied on the comparison.

In the following case, the comparison between 'zirkoniumet' and
'z', 'å' must be performed with "sv_SE" since the range is empty
if the collation were en_US and it is not the intention of the
user.

=# create table p (a text) partition by range (a collate "sv_SE");
=# create table c1 partition of p for values from ('z') to ('å');
=# insert into p values ('zirkoniumet');

> Please find attached an updated version of your patch. I think we'll need
> to make some documentation changes and think about a way to back-patch
> this to PG10.

Of course documentation change is required. So, I'm going the
following direction.

For PG11

- Modify gram.y as v6. (or v8 and separate refactor patch for
PG12 if rejected:p)
- Add collation-check code as v6
- + documentaion

For PG10

- Modify gram.y as v8
- Add collation-check code as v6
- + documentation

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Lakhin 2018-04-19 12:25:54 Re: make installcheck-world in a clean environment
Previous Message Amit Langote 2018-04-19 11:32:45 Re: Should we add GUCs to allow partition pruning to be disabled?