Re: Removing Functionally Dependent GROUP BY Columns

From: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Removing Functionally Dependent GROUP BY Columns
Date: 2016-01-14 11:19:37
Message-ID: 56978449.5000503@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 14/01/2016 01:30, David Rowley wrote:
> Many thanks for the thorough review!
>

And thanks for the patch and fast answer!

> On 12 January 2016 at 23:37, Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com
> <mailto:julien(dot)rouhaud(at)dalibo(dot)com>> wrote:
>
>
> In identify_key_vars()
>
> + /* Only PK constraints are of interest for now, see comment
> above */
> + if (con->contype != CONSTRAINT_PRIMARY)
> + continue;
>
> I think the comment should better refer to
> remove_useless_groupby_columns() (don't optimize further than what
> check_functional_grouping() does).
>
>
> I've improved this by explaining it more clearly.
>

Very nice.

Small typo though:

+ * Technically we could look at UNIQUE indexes too, however we'd also
+ * have to ensure that each column of the unique index had a NOT NULL

s/had/has/

+ * constraint, however since NOT NULL constraints currently are don't

s/are //

>
> [...]
>
>
> + {
> + varlistmatches = bms_add_member(varlistmatches,
> varlistidx);
> + found_col = true;
> + /* don't break, there might be dupicates */
> + }
>
> small typo on "dupicates". Also, I thought transformGroupClauseExpr()
> called in the parse analysis make sure that there isn't any duplicate in
> the GROUP BY clause. Do I miss something?
>
>
> No I missed something :) You are right. I've put a break; here and a
> comment to explain why.
>

ok :)

> [...]
>
>
> + /*
> + * If we found any surplus Vars in the GROUP BY clause, then
> we'll build
> + * a new GROUP BY clause without these surplus Vars.
> + */
> + if (anysurplus)
> + {
> + List *new_groupby = NIL;
> +
> + foreach (lc, root->parse->groupClause)
> + {
> + SortGroupClause *sgc = (SortGroupClause *) lfirst(lc);
> + TargetEntry *tle;
> + Var *var;
> +
> + tle = get_sortgroupclause_tle(sgc, root->parse->targetList);
> + var = (Var *) tle->expr;
> +
> + if (!IsA(var, Var))
> + continue;
> + [...]
>
> if var isn't a Var, it needs to be added in new_groupby.
>
>
> Yes you're right, well, at least I've written enough code to ensure that
> it's not needed.

Oh yes, I realize that now.

> Technically we could look inside non-Vars and check if the Expr is just
> made up of Vars from a single relation, and then we could also make that
> surplus if we find other Vars which make up the table's primary key. I
> didn't make these changes as I thought it was a less likely scenario. It
> wouldn't be too much extra code to add however. I've went and added an
> XXX comment to explain that there might be future optimisation
> opportunities in the future around this.
>

Agreed.

> I've attached an updated patch.
>

+ /* don't try anything unless there's two Vars */
+ if (varlist == NULL || list_length(varlist) < 2)
+ continue;

To be perfectly correct, the comment should say "at least two Vars".

Except the small remaining typos, this patch looks very fine to me. I
flag it as ready for committer.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2016-01-14 12:09:25 Re: Close handle leak in SSPI auth
Previous Message Marco Atzeri 2016-01-14 11:01:48 Re: Removing service-related code in pg_ctl for Cygwin