Re: GSOC13 proposal - extend RETURNING syntax

From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Karol Trzcionka <karlikt(at)gmail(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: GSOC13 proposal - extend RETURNING syntax
Date: 2013-08-21 17:17:51
Message-ID: 5214F63F.5050608@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

2013-08-20 21:06 keltezéssel, Karol Trzcionka írta:
> W dniu 20.08.2013 20:55, Boszormenyi Zoltan pisze:
>> Here's a new one, for v7:
>>
>> setrefs.c: In function ‘set_plan_refs’:
>> setrefs.c:2001:26: warning: ‘before_index’ may be used uninitialized in this function
>> [-Wmaybe-uninitialized]
>> bind_returning_variables(rlist, before_index, after_index);
>> ^
>> setrefs.c:1957:21: note: ‘before_index’ was declared here
>> int after_index=0, before_index;
>> ^
> Right, my mistake. Sorry and thanks. Fixed.
> Regards,
> Karol Trzcionka

With this fixed, a more complete review:

* Is the patch in a patch format which has context? (eg: context diff format)

Yes.

* Does it apply cleanly to the current git master?

Yes.

* Does it include reasonable tests, necessary doc patches, etc?

There is a new regression test (returning_before_after.sql) covering
this feature. However, I think it should be added to the group
where "returning.sql" resides currently. There is a value in running it
in parallel to other tests. Sometimes a subtle bug is uncovered
because of this and your v2 patch fixed such a bug already.

doc/src/sgml/ref/update.sgml describes this feature.

doc/src/sgml/dml.sgml should also be touched to cover this feature.

* Does the patch actually implement what it's supposed to do?

Yes.

* Do we want that?

Yes.

* Do we already have it?

No.

* Does it follow SQL spec, or the community-agreed behavior?

RETURNING is a PostgreSQL extension, so the SQL-spec part
of the question isn't applicable.

It implements the community-agreed behaviour, according to
the new regression test coverage.

* Does it include pg_dump support (if applicable)?

n/a

* Are there dangers?

I don't think so.

* Have all the bases been covered?

It seems so. I have also tried mixing before/after columns in
different orders and the query didn't fail:

zozo=# create table t1 (id serial primary key, i1 int4, i2 int4, t1 text, t2 text);
CREATE TABLE
zozo=# insert into t1 (i1, i2, t1, t2) values (1, 1, 'a', 'a');
INSERT 0 1
zozo=# insert into t1 (i1, i2, t1, t2) values (2, 2, 'b', 'b');
INSERT 0 1
zozo=# insert into t1 (i1, i2, t1, t2) values (3, 3, 'c', 'c');
INSERT 0 1
zozo=# select * from t1;
id | i1 | i2 | t1 | t2
----+----+----+----+----
1 | 1 | 1 | a | a
2 | 2 | 2 | b | b
3 | 3 | 3 | c | c
(3 rows)

zozo=# begin;
BEGIN
zozo=# update t1 set i2 = i2*2, t2 = t2 || 'x2' where id = 2 returning before.i1,
after.i1, before.i2, after.i2, before.t1, after.t1, before.t2, after.t2;
i1 | i1 | i2 | i2 | t1 | t1 | t2 | t2
----+----+----+----+----+----+----+-----
2 | 2 | 2 | 4 | b | b | b | bx2
(1 row)

UPDATE 1
zozo=# update t1 set i1 = i1 * 3, i2 = i2*2, t1 = t1 || 'x3', t2 = t2 || 'x2' where id = 3
returning before.i1, before.i2, after.i1, after.i2, before.t1, before.t2, after.t1,
after.t2; i1 | i2 | i1 | i2 | t1 | t2 | t1 | t2
----+----+----+----+----+----+-----+-----
3 | 3 | 9 | 6 | c | c | cx3 | cx2
(1 row)

UPDATE 1

* Does the feature work as advertised?

Yes.

* Are there corner cases the author has failed to consider?

I don't know.

* Are there any assertion failures or crashes?

No.

* Does the patch slow down simple tests?

No.

* If it claims to improve performance, does it?

n/a

* Does it slow down other things?

No.

* Does it follow the project coding guidelines?

Mostly.

In the src/test/regress/parallel_schedule contains an extra
new line at the end, it shouldn't.

In b/src/backend/optimizer/plan/setrefs.c:

+static void bind_returning_variables(List *rlist, int bef, int aft);

but later it becomes not public:

+ */
+void bind_returning_variables(List *rlist, int bef, int aft)
+{

Strange, but GCC 4.8.1 -Wall doesn't catch it. But the forward
declaration is not needed, the function is called only from
later functions.

Similar for parse_clause.c:

+extern void addAliases(ParseState *pstate);

+void addAliases(ParseState *pstate)

This external declaration is not needed since it is already
in src/include/parser/parse_clause.h

In setrefs.c, bind_returning_variables() I would also rename
the function arguments, so "before" and "after" are spelled out.
These are not C keywords.

Some assignments, like:

+ var=(Var*)tle;
and
+ int index_rel=1;

in setrefs.c need spaces.

"if()" statements need a space before the "(" and not after.

Add spaces in the {} list in addAliases():
+ char *aliases[] = {"before","after"};
like this: { "before", "after" }

Spaces are needed here, too:
+ for(i=0 ; i<noal; i++)

This "noal" should be "naliases" or "n_aliases" if you really want
a variable. I would simply use the constant "2" for the two for()
loops in addAliases() instead, its purpose is obvious enough.

In setrefs.c, bind_returning_variables():
+ Var *var = NULL;
+ foreach(temp, rlist){
Add an empty line after the declaration block.

* Are there portability issues?

No.

* Will it work on Windows/BSD etc?

Yes.

* Are the comments sufficient and accurate?

* Does it do what it says, correctly?

Yes.

* Does it produce compiler warnings?

No.

* Can you make it crash?

No.

* Is everything done in a way that fits together coherently with other features/modules?

I think so, mostly. Comments below.

* Are there interdependencies that can cause problems?

I don't think so.

Other comments:

This #define in pg_class:

diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 49c4f6f..1b09994 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -154,6 +154,7 @@ DESCR("");
#define RELKIND_COMPOSITE_TYPE 'c' /* composite type */
#define RELKIND_FOREIGN_TABLE 'f' /* foreign table */
#define RELKIND_MATVIEW 'm' /* materialized view */
+#define RELKIND_BEFORE 'b' /* virtual table for before/after
statements */

#define RELPERSISTENCE_PERMANENT 'p' /* regular table */
#define RELPERSISTENCE_UNLOGGED 'u' /* unlogged permanent
table */

The "RELKIND_*" values all show up in the pg_class table except
this new one. I don't think pg_class.h should be modified at all.
addAliases() should use RELKIND_RELATION together with
RTE_BEFORE. Then checks like:

+ if (rte->relkind == RELKIND_BEFORE)
+ continue;

should become

+ if (rte->relkind == RELKIND_RELATION && rte->rtekind == RTE_BEFORE)
+ continue;

Speaking of which, RTE_BEFORE is more properly named
RTE_RETURNING_ALIAS or something like that because it
covers both "before" and "after". Someone may have a better
idea for naming this symbol.

I feel like I understand what the code does and it looks sane to me.

One question, though, about this part:

----------------------------------------
@@ -1900,7 +1954,27 @@ set_returning_clause_references(PlannerInfo *root,
int rtoffset)
{
indexed_tlist *itlist;
+ int after_index=0, before_index=0;
+ Query *parse = root->parse;

+ ListCell *rt;
+ RangeTblEntry *bef;
+
+ int index_rel=1;
+
+ foreach(rt,parse->rtable)
+ {
+ bef = (RangeTblEntry *)lfirst(rt);
+ if(strcmp(bef->eref->aliasname,"after") == 0 && bef->rtekind == RTE_BEFORE )
+ {
+ after_index = index_rel;
+ }
+ if(strcmp(bef->eref->aliasname,"before") == 0 && bef->rtekind == RTE_BEFORE )
+ {
+ before_index = index_rel;
+ }
+ index_rel++;
+ }
/*
* We can perform the desired Var fixup by abusing the fix_join_expr
* machinery that formerly handled inner indexscan fixup. We search the
@@ -1924,6 +1998,7 @@ set_returning_clause_references(PlannerInfo *root,
resultRelation,
rtoffset);

+ bind_returning_variables(rlist, before_index, after_index);
pfree(itlist);

return rlist;
----------------------------------------

Why is it enough to keep the last before_index and after_index values?
What if there are more than one matching RangeTblEntry for "before"
and/or for "after"? Is it an error condition or of them should be fixed?

I think that's all for now.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Merlin Moncure 2013-08-21 17:24:52 Re: PL/pgSQL, RAISE and error context
Previous Message Jeff Janes 2013-08-21 16:28:32 Re: Back-patch change in hashed DISTINCT estimation?