Re: GSOC13 proposal - extend RETURNING syntax

From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Karol Trzcionka <karlikt(at)gmail(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: GSOC13 proposal - extend RETURNING syntax
Date: 2013-08-21 18:34:58
Message-ID: 52150852.6010101@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2013-08-21 20:00 keltezéssel, Boszormenyi Zoltan írta:
> 2013-08-21 19:17 keltezéssel, Boszormenyi Zoltan írta:
>> Hi,
>>
>> 2013-08-20 21:06 keltezéssel, Karol Trzcionka írta:
>>> W dniu 20.08.2013 20:55, Boszormenyi Zoltan pisze:
>>>> Here's a new one, for v7:
>>>>
>>>> setrefs.c: In function ‘set_plan_refs’:
>>>> setrefs.c:2001:26: warning: ‘before_index’ may be used uninitialized in this function
>>>> [-Wmaybe-uninitialized]
>>>> bind_returning_variables(rlist, before_index, after_index);
>>>> ^
>>>> setrefs.c:1957:21: note: ‘before_index’ was declared here
>>>> int after_index=0, before_index;
>>>> ^
>>> Right, my mistake. Sorry and thanks. Fixed.
>>> Regards,
>>> Karol Trzcionka
>>
>> With this fixed, a more complete review:
>>
>> * Is the patch in a patch format which has context? (eg: context diff format)
>>
>> Yes.
>>
>> * Does it apply cleanly to the current git master?
>>
>> Yes.
>>
>> * Does it include reasonable tests, necessary doc patches, etc?
>>
>> There is a new regression test (returning_before_after.sql) covering
>> this feature. However, I think it should be added to the group
>> where "returning.sql" resides currently. There is a value in running it
>> in parallel to other tests. Sometimes a subtle bug is uncovered
>> because of this and your v2 patch fixed such a bug already.
>>
>> doc/src/sgml/ref/update.sgml describes this feature.
>>
>> doc/src/sgml/dml.sgml should also be touched to cover this feature.
>>
>> * Does the patch actually implement what it's supposed to do?
>>
>> Yes.
>>
>> * Do we want that?
>>
>> Yes.
>>
>> * Do we already have it?
>>
>> No.
>>
>> * Does it follow SQL spec, or the community-agreed behavior?
>>
>> RETURNING is a PostgreSQL extension, so the SQL-spec part
>> of the question isn't applicable.
>>
>> It implements the community-agreed behaviour, according to
>> the new regression test coverage.
>>
>> * Does it include pg_dump support (if applicable)?
>>
>> n/a
>>
>> * Are there dangers?
>>
>> I don't think so.
>>
>> * Have all the bases been covered?
>>
>> It seems so. I have also tried mixing before/after columns in
>> different orders and the query didn't fail:
>>
>> zozo=# create table t1 (id serial primary key, i1 int4, i2 int4, t1 text, t2 text);
>> CREATE TABLE
>> zozo=# insert into t1 (i1, i2, t1, t2) values (1, 1, 'a', 'a');
>> INSERT 0 1
>> zozo=# insert into t1 (i1, i2, t1, t2) values (2, 2, 'b', 'b');
>> INSERT 0 1
>> zozo=# insert into t1 (i1, i2, t1, t2) values (3, 3, 'c', 'c');
>> INSERT 0 1
>> zozo=# select * from t1;
>> id | i1 | i2 | t1 | t2
>> ----+----+----+----+----
>> 1 | 1 | 1 | a | a
>> 2 | 2 | 2 | b | b
>> 3 | 3 | 3 | c | c
>> (3 rows)
>>
>> zozo=# begin;
>> BEGIN
>> zozo=# update t1 set i2 = i2*2, t2 = t2 || 'x2' where id = 2 returning before.i1,
>> after.i1, before.i2, after.i2, before.t1, after.t1, before.t2, after.t2;
>> i1 | i1 | i2 | i2 | t1 | t1 | t2 | t2
>> ----+----+----+----+----+----+----+-----
>> 2 | 2 | 2 | 4 | b | b | b | bx2
>> (1 row)
>>
>> UPDATE 1
>> zozo=# update t1 set i1 = i1 * 3, i2 = i2*2, t1 = t1 || 'x3', t2 = t2 || 'x2' where id
>> = 3 returning before.i1, before.i2, after.i1, after.i2, before.t1, before.t2, after.t1,
>> after.t2; i1 | i2 | i1 | i2 | t1 | t2 | t1 | t2
>> ----+----+----+----+----+----+-----+-----
>> 3 | 3 | 9 | 6 | c | c | cx3 | cx2
>> (1 row)
>>
>> UPDATE 1
>>
>>
>>
>> * Does the feature work as advertised?
>>
>> Yes.
>>
>> * Are there corner cases the author has failed to consider?
>>
>> I don't know.
>>
>> * Are there any assertion failures or crashes?
>>
>> No.
>>
>> * Does the patch slow down simple tests?
>>
>> No.
>>
>> * If it claims to improve performance, does it?
>>
>> n/a
>>
>> * Does it slow down other things?
>>
>> No.
>>
>> * Does it follow the project coding guidelines?
>>
>> Mostly.
>>
>> In the src/test/regress/parallel_schedule contains an extra
>> new line at the end, it shouldn't.
>>
>> In b/src/backend/optimizer/plan/setrefs.c:
>>
>> +static void bind_returning_variables(List *rlist, int bef, int aft);
>>
>> but later it becomes not public:
>>
>> + */
>> +void bind_returning_variables(List *rlist, int bef, int aft)
>> +{
>>
>> Strange, but GCC 4.8.1 -Wall doesn't catch it. But the forward
>> declaration is not needed, the function is called only from
>> later functions.
>>
>> Similar for parse_clause.c:
>>
>> +extern void addAliases(ParseState *pstate);
>>
>> +void addAliases(ParseState *pstate)
>>
>> This external declaration is not needed since it is already
>> in src/include/parser/parse_clause.h
>>
>> In setrefs.c, bind_returning_variables() I would also rename
>> the function arguments, so "before" and "after" are spelled out.
>> These are not C keywords.
>>
>> Some assignments, like:
>>
>> + var=(Var*)tle;
>> and
>> + int index_rel=1;
>>
>> in setrefs.c need spaces.
>>
>> "if()" statements need a space before the "(" and not after.
>>
>> Add spaces in the {} list in addAliases():
>> + char *aliases[] = {"before","after"};
>> like this: { "before", "after" }
>>
>> Spaces are needed here, too:
>> + for(i=0 ; i<noal; i++)
>>
>> This "noal" should be "naliases" or "n_aliases" if you really want
>> a variable. I would simply use the constant "2" for the two for()
>> loops in addAliases() instead, its purpose is obvious enough.
>>
>> In setrefs.c, bind_returning_variables():
>> + Var *var = NULL;
>> + foreach(temp, rlist){
>> Add an empty line after the declaration block.
>>
>>
>> * Are there portability issues?
>>
>> No.
>>
>> * Will it work on Windows/BSD etc?
>>
>> Yes.
>>
>> * Are the comments sufficient and accurate?
>
> There should be more comments, especially regarding
> my question at the end.
>
>>
>>
>>
>> * Does it do what it says, correctly?
>>
>> Yes.
>>
>> * Does it produce compiler warnings?
>>
>> No.
>>
>> * Can you make it crash?
>>
>> No.
>>
>> * Is everything done in a way that fits together coherently with other features/modules?
>>
>> I think so, mostly. Comments below.
>>
>> * Are there interdependencies that can cause problems?
>>
>> I don't think so.
>>
>> Other comments:
>>
>> This #define in pg_class:
>>
>> diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
>> index 49c4f6f..1b09994 100644
>> --- a/src/include/catalog/pg_class.h
>> +++ b/src/include/catalog/pg_class.h
>> @@ -154,6 +154,7 @@ DESCR("");
>> #define RELKIND_COMPOSITE_TYPE 'c' /* composite type */
>> #define RELKIND_FOREIGN_TABLE 'f' /* foreign table */
>> #define RELKIND_MATVIEW 'm' /* materialized view */
>> +#define RELKIND_BEFORE 'b' /* virtual table for
>> before/after statements */
>>
>> #define RELPERSISTENCE_PERMANENT 'p' /* regular table */
>> #define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent
>> table */
>>
>> The "RELKIND_*" values all show up in the pg_class table except
>> this new one. I don't think pg_class.h should be modified at all.
>> addAliases() should use RELKIND_RELATION together with
>> RTE_BEFORE. Then checks like:
>>
>> + if (rte->relkind == RELKIND_BEFORE)
>> + continue;
>>
>> should become
>>
>> + if (rte->relkind == RELKIND_RELATION && rte->rtekind == RTE_BEFORE)
>> + continue;
>
> Thinking about it more,
> if (rte->rtekind == RTE_BEFORE)
> would be enough, as no other kinds of rte's can have rtekind == RTE_BEFORE.
>
>>
>> Speaking of which, RTE_BEFORE is more properly named
>> RTE_RETURNING_ALIAS or something like that because it
>> covers both "before" and "after". Someone may have a better
>> idea for naming this symbol.
>>
>> I feel like I understand what the code does and it looks sane to me.
>>
>> One question, though, about this part:
>>
>> ----------------------------------------
>> @@ -1900,7 +1954,27 @@ set_returning_clause_references(PlannerInfo *root,
>> int rtoffset)
>> {
>> indexed_tlist *itlist;
>> + int after_index=0, before_index=0;
>> + Query *parse = root->parse;
>>
>> + ListCell *rt;
>> + RangeTblEntry *bef;
>> +
>> + int index_rel=1;
>> +
>> + foreach(rt,parse->rtable)
>> + {
>> + bef = (RangeTblEntry *)lfirst(rt);
>> + if(strcmp(bef->eref->aliasname,"after") == 0 && bef->rtekind ==
>> RTE_BEFORE )
>> + {
>> + after_index = index_rel;
>> + }
>> + if(strcmp(bef->eref->aliasname,"before") == 0 && bef->rtekind ==
>> RTE_BEFORE )
>> + {
>> + before_index = index_rel;
>> + }
>> + index_rel++;
>> + }
>> /*
>> * We can perform the desired Var fixup by abusing the fix_join_expr
>> * machinery that formerly handled inner indexscan fixup. We search the
>> @@ -1924,6 +1998,7 @@ set_returning_clause_references(PlannerInfo *root,
>> resultRelation,
>> rtoffset);
>>
>> + bind_returning_variables(rlist, before_index, after_index);
>> pfree(itlist);
>>
>> return rlist;
>> ----------------------------------------
>>
>> Why is it enough to keep the last before_index and after_index values?
>> What if there are more than one matching RangeTblEntry for "before"
>> and/or for "after"? Is it an error condition or of them should be fixed?

Since addAliases() only adds a single one for each and only for an
UPDATE ... RETURNING query, it is okay. Also, because
set_returning_clause_references() is called separately for each
query with RETURNING. Anyway, a comment before the new
foreach() loop in set_returning_clause_references() should explain
the fact that only one of each ("before" and "after") can occur for
such a query.

I have just tried this:

before as (update t1 set i1 = i1 * 2, i2 = i2 * 3, t1 = t1 || 'x2', t2 = t2 || 'x3'
where id = 1 returning before.i1, after.i1, before.i2, after.i2),
after as (update t1 set i1 = i1 * 2, i2 = i2 * 3, t1 = t1 || 'x2', t2 = t2 || 'x3'
where id = 2 returning before.i1, after.i1, before.i2, after.i2)
select * from (select * from before union all select * from after) as x;

and it gave me the proper results, no crash.

About addAliases():
- it can be moved to parser/analyze.c so it can be static.
- addReturningAliases() may be a better name for the function.

>>
>> I think that's all for now.
>>
>> Best regards,
>> Zoltán Böszörményi
>>
>> --
>> ----------------------------------
>> Zoltán Böszörményi
>> Cybertec Schönig & Schönig GmbH
>> Gröhrmühlgasse 26
>> A-2700 Wiener Neustadt, Austria
>> Web:http://www.postgresql-support.de
>> http://www.postgresql.at/
>
>
> --
> ----------------------------------
> Zoltán Böszörményi
> Cybertec Schönig & Schönig GmbH
> Gröhrmühlgasse 26
> A-2700 Wiener Neustadt, Austria
> Web:http://www.postgresql-support.de
> http://www.postgresql.at/

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Karol Trzcionka 2013-08-21 18:52:25 Re: GSOC13 proposal - extend RETURNING syntax
Previous Message Boszormenyi Zoltan 2013-08-21 18:00:50 Re: GSOC13 proposal - extend RETURNING syntax