Re: Writeable CTE patch

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi>
Cc: Alex Hunsaker <badalex(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Writeable CTE patch
Date: 2009-11-28 18:59:14
Message-ID: 29292.1259434754@sss.pgh.pa.us
Lists: pgsql-hackers

Marko Tiikkaja <marko(dot)tiikkaja(at)cs(dot)helsinki(dot)fi> writes:
> Attached is the latest version of the patch.

I looked through this patch and concluded that it still needs a fair
amount of work, so I'm bouncing it back for revision.

1. I thought we'd agreed at
http://archives.postgresql.org/pgsql-hackers/2009-10/msg00558.php
that the patch should support WITH on DML statements, eg
with (some-query) insert into foo ...
This might not take much more than grammar additions, but it's
definitely lacking at the moment.

2. The handling of rules on DML WITH queries is far short of sufficient.
AFAICT, what it's doing is rewriting the query, then taking the first
or last element of the resulting query list as replacing the WITH
query, and adding the rest of the list after or before the main query.
This does not work at all for cases involving conditional DO INSTEAD
rules, since there could be more than one element of the resulting
query list that's responsible for delivering results depending on the
runtime outcome of the condition. I don't think it works for
unconditional DO INSTEAD either, since the rule producing output might
not be the first or last one. And in any case it fails to satisfy the
principle of least astonishment (POLA) with regard to the order of
execution of DO ALSO queries relative to other WITH queries or the main
query.

I am not sure that it is possible to fix this without really drastic
surgery on the rule mechanisms. Or maybe we ought to rethink what
the representation of DML WITH queries is.

Perhaps it would be acceptable to just throw an
ERRCODE_FEATURE_NOT_SUPPORTED error when there are DO ALSO or
conditional DO INSTEAD rules applying to the
target of a DML WITH query. I wouldn't normally think that just blowing
off such a thing meets the project's quality standards, but we all know
that the current rule mechanism is in need of a ground-up redesign anyway.
It's hard to justify putting a lot of work into making it work with DML
WITH queries when we might be throwing it all out in the future.

One thing that really does have to draw an error is that AFAIR the current
rule feature doesn't enforce that a rewritten query produce the same type
of output that the original would have. We just ship off whatever the
results are to the client, and let it sort everything out. In a DML WITH
query, though, I think we do have to insist that the rewritten query(s)
still produce the same RETURNING rowtype as before.
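
As a rough illustration of the rowtype requirement above (not code from
the patch or from PostgreSQL), the check amounts to comparing the
RETURNING column types of the original and rewritten queries position by
position. The RowType representation and the function name below are
hypothetical, and ignoring column names is an assumption of this sketch.

```python
from typing import List, Tuple

# Hypothetical representation: an ordered list of (column_name, type_name)
# pairs. This is not a PostgreSQL data structure.
RowType = List[Tuple[str, str]]


def check_returning_rowtype(original: RowType, rewritten: RowType) -> None:
    """Raise TypeError if the rewritten query's RETURNING column types
    differ from the original's (compared positionally, names ignored)."""
    orig_types = [type_name for _, type_name in original]
    new_types = [type_name for _, type_name in rewritten]
    if orig_types != new_types:
        raise TypeError(
            f"rewritten query returns {new_types}, expected {orig_types}"
        )
```

A rewrite that only renames a column would pass this check; one that
changes a column's type, or adds or drops a column, would raise.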

3. I'm pretty unimpressed with the code added to ExecutePlan. It knows
way more than it ought to about CTEs, and yet I don't think it's doing the
right things anyway --- in particular, won't it run the "leader" CTE more
than once if one CTE references another? I think it would be better if
the PlannedStmt representation just told ExecutePlan what to do, rather
than having all these heuristics inside ExecutePlan. (BTW, I also think
it would work better if you had the CommandCounterIncrement at the bottom
of the loop, after the subquery execution not before it. But I'm not sure
it's safe for ExecutePlan to be modifying the snapshot it's handed anyway.)
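
To make the ordering concern in point 3 concrete, here is a toy model
(not executor code) in which the plan supplies the dependencies between
CTEs and the loop simply runs each one exactly once, dependencies first,
bumping a counter at the bottom of the loop. The names run_all, deps and
run_one are hypothetical; the integer counter only stands in for the
idea of CommandCounterIncrement.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_all(deps, run_one):
    """Run every CTE exactly once, after the CTEs it reads from.

    deps maps each CTE name to the set of CTE names it references.
    run_one(name, command_id) executes one CTE (hypothetical callback).
    Returns the final value of the toy command counter.
    """
    command_id = 0
    for name in TopologicalSorter(deps).static_order():
        run_one(name, command_id)
        # Bump the counter at the bottom of the loop, after execution,
        # so the next CTE sees the effects of this one.
        command_id += 1
    return command_id
```

With deps such as {"a": set(), "b": {"a"}, "c": {"a", "b"}}, the CTE "a"
is executed once even though both "b" and "c" reference it.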

I wonder whether it would be practical to fix both #2 and #3 by having the
representation of DML WITH queries look more like the representation of
rule rewrite output --- that is, generate a list of top-level Queries
not one Query with DML subqueries in its CTE list. The main thing that
seems to be missing in order to allow that is for a Query to refer back to
the output of a previous Query in the list. This doesn't seem
tremendously hard at runtime --- it's just a tuplestore to keep around
--- but I'm not clear what it ought to look like in terms of the parsetree
representation.
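
A minimal sketch of the runtime side of that idea, assuming each
top-level query can be modeled as a function that receives the stored
outputs of the queries before it. A plain dict of lists stands in for
the tuplestores; run_query_list and the query names are hypothetical,
and the sketch says nothing about the open parsetree question.

```python
def run_query_list(queries):
    """Execute a list of top-level queries in order.

    queries is a list of (name, func) pairs. Each func takes a dict of
    the outputs of earlier queries and returns an iterable of rows.
    Every output is materialized and kept so later queries can read it.
    """
    stores = {}
    for name, func in queries:
        # Materialize the result once, like a tuplestore kept around.
        stores[name] = list(func(stores))
    return stores
```

For example, a first query could produce the rows of a DELETE ...
RETURNING, and the second could read stores["deleted"] instead of
re-running the DELETE.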

4. As previously noted, the changes to avoid using es_result_relation_info
are broken and need to be dropped from the patch.

regards, tom lane
