Re: executor relation handling

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: executor relation handling
Date: 2018-10-05 02:06:29
Message-ID: d098fd6f-7f23-1c38-ccb8-d4eb49d32972@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2018/10/05 5:59, Tom Lane wrote:
> Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> writes:
>> I've rebased the remaining patches. I broke down one of the patches into
>> 2 and re-ordered the patches as follows:
>
>> 0001: introduces a function that opens range table relations and maintains
>> them in an array indexes by RT index
>
>> 0002: introduces a new field in EState that's an array of RangeTblEntry
>> pointers and revises macros used in the executor that access RTEs to
>> return them from the array (David Rowley co-authored this one)
>
> I've pushed 0001 and 0002 with mostly cosmetic changes.

Thanks a lot.

> One thing I
> wanted to point out explicitly, though, is that I found this bit of 0002
> to be a seriously bad idea:
>
> --- a/src/include/nodes/execnodes.h
> +++ b/src/include/nodes/execnodes.h
> @@ -20,6 +20,7 @@
> #include "executor/instrument.h"
> #include "lib/pairingheap.h"
> #include "nodes/params.h"
> +#include "nodes/parsenodes.h"
> #include "nodes/plannodes.h"
> #include "utils/hsearch.h"
> #include "utils/queryenvironment.h"
>
> Please do not add #includes of fundamental headers to other fundamental
> headers without clearing it with somebody. There's little enough
> structure to our header collection now. I don't want to end up in a
> situation where effectively the entire header set gets pulled into
> every .c file, or worse that we have actual reference loops in the
> headers. (This is not an academic problem; somebody actually created
> such a loop awhile back. Cleaning it up, by the time we'd recognized
> the problem, was really painful.)

Okay, sorry about that. I was slightly nervous that I had to do it when
doing it, but forgot to mention that explicitly in the commit message or
the email.

>> 0003: moves result relation and ExecRowMark initialization out of InitPlan
>> and into ExecInit* routines of respective nodes
>
> I am finding myself pretty unconvinced by this one; it seems like mostly
> a random reallocation of responsibility with little advantage. The
> particular thing that brought me to a screeching halt was seeing that
> the patch removed ExecFindRowMark, despite the fact that that's part
> of our advertised FDW API (see fdwhandler.sgml), and it didn't provide
> any alternative way for an FDW to find out at runtime whether it's
> subject to a row locking requirement.
>
> I thought for a minute about just leaving the function in place, but
> that wouldn't work because both nodeLockRows and nodeModifyTable are
> written so that they find^H^H^Hbuild their rowmarks only after recursing
> to initialize their child plan nodes; so a child node that tried to use
> ExecFindRowMark during ExecInitNode would get the wrong answer. Of
> course, we could consider changing the order of operations during
> initialization of those node types, but I'm not really seeing a compelling
> reason why we should whack things around that much.
>
> So I'm inclined to just omit 0003. AFAICS this would only mean that
> we couldn't drop the global PlanRowMarks list from PlannedStmt, which
> does not bother me much.

To be honest, I too had begun to fail to see the point of this patch since
yesterday. In fact, getting this one to pass make check-world took a bit
of head-scratching due to its interaction with EvalPlanQuals EState
building, so I was almost to drop it from the series. But I felt that it
might still be a good idea to get rid of what was described as special
case code. Reading your argument against it based on how it messes up
some of the assumptions regarding ExecRowMark design, I'm willing to let
go of this one as more or less a cosmetic improvement.

>> 0005: teaches planner to remove PlanRowMarks corresponding to dummy relations
>
> I'm not entirely sold on the value of that either?

If you look at the first email on this thread, you can see that trimming
the PlanRowMarks list down to just the ones needed during execution
results in non-trivial improvement in SELECT FOR SHARE on partitioned
tables with large number of partitions:

<quote>
Speedup is more pronounced with a benchmark that needs RowMarks, because
one of the patches (0003) removes overhead around handling them.

$ cat /tmp/select-lt-for-share.sql
select * from lt where b = 999 for share;

master

tps = 94.095985 (excluding connections establishing)
tps = 93.955702 (excluding connections establishing)

<snip>

patch 0003 (prune PlanRowMarks)

tps = 712.544029 (excluding connections establishing)
tps = 717.540052 (excluding connections establishing)
</quote>

But on reflection, this seems more like adding special case code to the
planner just for this, rather than solving the more general problem of
initializing *any* partition information after pruning, being discussed
over at "speeding up planning with partitions" thread [1]. So, I don't
mind dropping this one too.

So, that leaves us with only the patch that revises the plan node fields
in light of having relieved the executor of doing any locking of its own
in *some* cases. That patch's premise is that we don't need the fields
that the patch removes because they're only referenced for the locking
purposes. But, if those plan nodes support parallel execution and hence
will be passed to parallel workers for execution who will need to lock the
tables contained in the plan nodes, then they better contain all the
information needed for locking *every* affected tables, so we had better
not removed it. Plan nodes in question are Append, MergeAppend, and
ModifyTable, of which only the first two support parallel execution, but
maybe we should just leave all of them alone for now. IOW, I'm not sure
about that patch either. Thoughts?

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-10-05 02:12:04 Re: Odd 9.4, 9.3 buildfarm failure on s390x
Previous Message Pavel Stehule 2018-10-05 01:34:14 Re: [HACKERS] proposal: schema variables