Re: numbering plan nodes

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: numbering plan nodes
Date: 2015-09-17 18:44:00
Message-ID: CA+Tgmoa7NwARTAtm=7vVthp=zNbY5m-jBNttfvu_qcm1Hg3nZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks much for the quick reply.

On Thu, Sep 17, 2015 at 2:23 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> It would be nice if we could assign the integer IDs with no gaps, and
>> with the IDs of a node's children immediately following that of their
>> parent.
>
> I do not think this should be a goal, either explicitly or implicitly.
> For one thing, it will be very much in the eye of individual beholders
> which nodes are children-that-should-be-consecutive. What about
> initplans, subplans, grandchild nodes? Different use cases could easily
> have different opinions about what to do with those.

I was aiming for: the IDs of the descendents of any given node are
consecutive and have IDs which follow that's node ID, in any old order
you like.

>> The advantage of this is that if you want to have a data
>> structure for every node in the tree passed to some worker - like a
>> struct Instrumentation in dynamic shared memory - you can just create
>> an array and index it by (node ID) - (node ID of uppermost child
>> passed to worker), and every slot will be in use, so no memory is
>> wasted and no lookup table is needed.
>
> I think you could just waste the memory, or more likely you could cope
> with one level of indirection so that you only waste a pointer per
> node (ie, array of pointers indexed by node ID, only fill the ones that
> are relevant).

Hmm, I guess that wouldn't be too bad. A straight join between N
relations will have N base relations and N - 1 joins and maybe a few
sorts or similar. You could get a plan tree with a lot more nodes if
your query involves inheritance hierarchies - e.g. join 4 tables each
of which has 1000 children. But even in that case you're only talking
about 32kB of memory for the indirect table, and that's not going to
be very high on your list of problems in that case.

>> My main concern with this design is how future-proof it is. I know
>> Tom is working on ripping the planner apart and putting it back
>> together again to allow the upper levels of the planner to use paths,
>> and the discussion of the SS_finalize_plan() changes made some
>> reference to possibly wanting to mess with set_plan_references().
>
> I can't imagine I'd be doing anything that would break the simple case
> of "give every node a distinct ID". If you are building in weird
> assumptions about traversal order, that might indeed be a problem.

Good to know, thanks. With the design you proposal above, we can make
this insensitive to traversal order. The only thing that would cause
trouble is if you somehow ended up with massive gaps in the numbering
sequence - e.g. if the final plan had 6 nodes, but the IDs were all 5
digit numbers, you'd waste a silly amount of memory relative to the
size of the plan. But a few gaps (e.g. for removed SubqueryScan
nodes) won't be a problem.

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2015-09-17 18:47:14 Re: [patch] Proposal for \rotate in psql
Previous Message Tom Lane 2015-09-17 18:23:52 Re: numbering plan nodes