WIP patch: EvalPlanQual rewrite

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: WIP patch: EvalPlanQual rewrite
Date: 2009-10-24 23:16:30
Message-ID: 15178.1256426190@sss.pgh.pa.us
Lists: pgsql-hackers

The attached WIP patch revises EvalPlanQual processing along the lines
I've been muttering about for the last little while. It's a fairly
large patch, but it should be a lot more efficient than the existing
code as well as delivering much saner behavior. The key points are:

1. In order to avoid pushing even more planner/executor-only fields into
RowMarkClause, I created a separate node type PlanRowMark that is used
during planning and in completed plan trees. This returns RowMarkClause
to where it was in 8.3, and insulates stored rules from future changes
in this area. This change accounts for a pretty large fraction of the
textual delta, but seemed worth doing to avoid future forced initdbs.

2. PlanRowMark nodes are now created for every scan relation in an UPDATE,
DELETE, or SELECT FOR UPDATE/SHARE query, even if the relation is not
called out as an update or locking target. This is needed to carry the
information about the resjunk output columns that identify the current
scan row for such relations. For regular-table scan relations, we just
include the TID as the ID information, so the incremental cost is minimal.
For non-table relations (such as VALUES or function scans), we include a
whole copy of the current row using a whole-row Var. This is a bit
expensive but the case seems uncommon enough to not be worth trying to
be cute about.
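
As a rough illustration of point 2, here is a toy sketch (not code from the
patch) of the two kinds of row identity: a TID that lets the row be re-fetched
from a regular table, versus a saved copy of the whole row for relations that
have no TID. All names here (ToyRow, ToyRowId, toy_make_row_id, toy_refetch)
are hypothetical, and the "table" is just an array indexed by a fake TID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy types for illustration; not the PostgreSQL definitions. */
#define TOY_MAX_COLS 4

typedef struct ToyRow
{
    int ncols;
    int cols[TOY_MAX_COLS];
} ToyRow;

typedef enum ToyMarkKind
{
    TOY_MARK_TID,   /* regular table: remember only where the row lives */
    TOY_MARK_COPY   /* VALUES, function scan, etc.: remember the row itself */
} ToyMarkKind;

typedef struct ToyRowId
{
    ToyMarkKind kind;
    size_t      tid;    /* valid when kind == TOY_MARK_TID */
    ToyRow      copy;   /* valid when kind == TOY_MARK_COPY */
} ToyRowId;

/* Build the junk "row identity" value for the current scan row. */
static ToyRowId
toy_make_row_id(bool is_regular_table, size_t tid, const ToyRow *row)
{
    ToyRowId id;

    memset(&id, 0, sizeof(id));
    if (is_regular_table)
    {
        id.kind = TOY_MARK_TID;     /* cheap: just the row's address */
        id.tid = tid;
    }
    else
    {
        assert(row != NULL);
        id.kind = TOY_MARK_COPY;    /* costlier: carry the whole row along */
        id.copy = *row;
    }
    return id;
}

/* Recover the scan row later, from the table or from the saved copy. */
static bool
toy_refetch(const ToyRowId *id, const ToyRow *table, size_t table_len,
            ToyRow *out)
{
    if (id->kind == TOY_MARK_COPY)
    {
        *out = id->copy;
        return true;
    }
    if (id->tid >= table_len)
        return false;               /* no such row in the toy table */
    *out = table[id->tid];
    return true;
}
```

The copy variant costs a full row per output tuple, which matches the
trade-off described above for the uncommon non-table case.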

3. The original implementation of EvalPlanQual could not re-use the test
plan tree --- it had to go through a full executor init and shutdown for
every row to be tested. To avoid this overhead, I've associated a special
runtime Param with each LockRows or ModifyTable plan node, and arranged to
make every scan node below such a node depend on that Param. Thus, by
signaling a change in that Param, the EPQ machinery can just rescan the
already-built test plan. This addition accounts for the remaining changes
in the planner.
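
The idea in point 3 can be sketched with a toy model (hypothetical names
throughout, and far simpler than the real executor): the test plan is set up
once, and each scan compares the parameter value it last saw against the
current one, restarting itself when they differ. Nothing here is taken from
the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a reusable test plan. */
typedef struct ToyTestPlan
{
    int init_count;    /* times the expensive setup ran */
    int rescan_count;  /* times the cheap rescan ran */
    int epq_param;     /* current value of the special runtime parameter */
    int seen_param;    /* parameter value the scan was last positioned for */
    int pos;           /* index of the next row to return */
} ToyTestPlan;

static const int toy_rows[] = {10, 20, 30};
#define TOY_NROWS 3

/* Expensive one-time setup; the caller zero-initializes the struct first. */
static void
toy_plan_init(ToyTestPlan *plan)
{
    plan->init_count++;
    plan->seen_param = plan->epq_param;
    plan->pos = 0;
}

/* Signal that a new row is to be tested: just change the parameter. */
static void
toy_epq_signal(ToyTestPlan *plan)
{
    plan->epq_param++;
}

/* Fetch the next row, restarting the scan if the parameter changed. */
static bool
toy_scan_next(ToyTestPlan *plan, int *out)
{
    if (plan->seen_param != plan->epq_param)
    {
        plan->pos = 0;
        plan->seen_param = plan->epq_param;
        plan->rescan_count++;
    }
    if (plan->pos >= TOY_NROWS)
        return false;
    *out = toy_rows[plan->pos++];
    return true;
}
```

In this model, testing a second row costs one toy_epq_signal() call and a
rescan, while init_count stays at 1.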

4. We push the current row values for *all* scan relations into the EPQ
machinery, so that any join is operating over at most one row per
relation. This makes the test query far cheaper than it often is under
the old implementation, and also solves the problem of duplicated outputs.
(To ensure no duplicated outputs in SELECT FOR UPDATE, I also added a
prohibition on SRFs in the targetlist, as mentioned earlier.)
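
To see why point 4 bounds the work and removes duplicates, consider this toy
nested-loop join (a sketch with hypothetical names, not the executor's
logic). When a substitute row is installed for a relation, the scan of that
relation yields exactly that one row, so the join can emit at most one match.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy relation: an array of join keys, plus an optional
 * substitute key standing in for "the current row" of that relation. */
typedef struct ToyRel
{
    const int *keys;
    int        nrows;
    const int *subst;   /* NULL, or the single row to use instead */
} ToyRel;

/* Number of rows the scan will return. */
static int
toy_rel_nrows(const ToyRel *rel)
{
    return rel->subst ? 1 : rel->nrows;
}

/* Key of the i'th row the scan will return. */
static int
toy_rel_key(const ToyRel *rel, int i)
{
    return rel->subst ? *rel->subst : rel->keys[i];
}

/* Count the outputs of a nested-loop equijoin over the two scans. */
static int
toy_join_count(const ToyRel *outer, const ToyRel *inner)
{
    int matches = 0;

    for (int i = 0; i < toy_rel_nrows(outer); i++)
        for (int j = 0; j < toy_rel_nrows(inner); j++)
            if (toy_rel_key(outer, i) == toy_rel_key(inner, j))
                matches++;
    return matches;
}
```

With duplicate keys in both inputs the unrestricted join returns several
rows, whereas the substituted version returns zero or one.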

5. I rearranged the interface between execScan.c and the plan node types
that use it, so as to get most of the knowledge about EPQ rechecking out
of the per-node-type code and into one place. There is some
plan-node-type specific logic needed for EPQ, namely we have to be able to
check the indexqual conditions against the substitute tuple when using
an indexscan or bitmap indexscan node. I put that into a new callback
function that parallels the AccessMtd callback function.
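
The shape of the interface in point 5 might look roughly like the following
toy sketch: one shared routine owns the EPQ logic, and each node type
supplies two callbacks, one to fetch the next row and one to recheck its own
conditions against a substitute row. The names (ToyScan, toy_exec_scan,
toy_index_next, toy_index_recheck) are made up for illustration, and the
"index qual" is just a lower bound on an integer.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct ToyScan ToyScan;

/* Per-node-type callbacks: fetch the next row, recheck a substitute row. */
typedef bool (*ToyAccessFn) (ToyScan *scan, int *row);
typedef bool (*ToyRecheckFn) (ToyScan *scan, int row);

struct ToyScan
{
    const int *rows;        /* underlying data */
    int        nrows;
    int        pos;
    int        lower_bound; /* stand-in for an index qual: row >= bound */
    bool       epq_active;  /* is a substitute row installed? */
    bool       epq_done;    /* has the substitute row been returned? */
    int        epq_row;     /* the substitute row */
};

/* Shared driver: all EPQ knowledge lives here, not in the callbacks. */
static bool
toy_exec_scan(ToyScan *scan, ToyAccessFn access, ToyRecheckFn recheck,
              int *out)
{
    if (scan->epq_active)
    {
        if (scan->epq_done)
            return false;           /* substitute row is returned only once */
        scan->epq_done = true;
        if (!recheck(scan, scan->epq_row))
            return false;           /* row no longer satisfies the qual */
        *out = scan->epq_row;
        return true;
    }
    return access(scan, out);
}

/* Toy "index scan": return the next row satisfying the bound. */
static bool
toy_index_next(ToyScan *scan, int *row)
{
    while (scan->pos < scan->nrows)
    {
        int candidate = scan->rows[scan->pos++];

        if (candidate >= scan->lower_bound)
        {
            *row = candidate;
            return true;
        }
    }
    return false;
}

/* Toy recheck: apply the same bound to a substitute row. */
static bool
toy_index_recheck(ToyScan *scan, int row)
{
    return row >= scan->lower_bound;
}
```

A node type whose scan applies no conditions of its own could pass a recheck
callback that simply returns true.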

This isn't quite ready to commit yet --- there are a couple of minor loose
ends to fix, and it needs a full re-read --- but I think it's close enough
to put out for comment.

regards, tom lane

Attachment                        Content-Type               Size
evalplanqual-rewrite-1.patch.gz   application/octet-stream   39.0 KB
