Re: RTE_NAMEDTUPLESTORE, enrtuples and comments

From: Noah Misch <noah(at)leadboat(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Kevin Grittner <kgrittn(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Subject: Re: RTE_NAMEDTUPLESTORE, enrtuples and comments
Date: 2017-06-13 06:40:12
Message-ID: 20170613064012.GC1664286@rfd.leadboat.com
Lists: pgsql-hackers

On Mon, Jun 12, 2017 at 04:04:23PM +1200, Thomas Munro wrote:
> On Sun, Jun 11, 2017 at 11:11 PM, Thomas Munro
> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> > On Sun, Jun 11, 2017 at 6:25 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> >> This fourth point is not necessarily a defect: I wonder if RangeTblEntry is
> >> the right place for enrtuples. It's a concept regularly seen in planner data
> >> structures but not otherwise seen at parse tree level.
> >
> > I agree that this is strange. Perhaps
> > set_namedtuplestore_size_estimates should instead look up the
> > EphemeralNamedRelation by rte->enrname to find its way to
> > enr->md.enrtuples, but I'm not sure off the top of my head how it
> > should get its hands on the QueryEnvironment required to do that. I
> > will look into this on Monday, but other ideas/clues welcome...
>
> Here's some background: If you look at the interface changes
> introduced by 18ce3a4 you will see that it is now possible for a
> QueryEnvironment object to be injected into the parser and executor.
> Currently the only use for it is to inject named tuplestores into the
> system via SPI_register_relation or SPI_register_trigger_data. That's
> to support SQL standard transition tables, but anyone can now use SPI
> to expose data to SQL via an ephemeral named relation in this way.
> (In future we could imagine other kinds of objects like table
> variables, anonymous functions or streams).
>
> The QueryEnvironment is used by the parser to resolve names and build
> RangeTblEntry objects. The planner doesn't currently need it, because
> all the information it needs is in the RangeTblEntry, including the
> offending row estimate. Then the executor needs it to get its hands
> on the tuplestores. So the question is: how can we get it into
> costsize.c, so that it can look up the EphemeralNamedRelationMetadata
> object by name, instead of trafficking statistics through parser data
> structures?
>
> Here are a couple of ways forward that I can see:
>
> 1. Figure out how to get the QueryEnvironment through more of these
> stack frames (possibly inside other objects), so that
> set_namedtuplestore_size_estimates can look up enrtuples by enrname:
>
>   set_namedtuplestore_size_estimates <-- would need QueryEnvironment
>   set_namedtuplestore_pathlist
>   set_rel_size
>   set_base_rel_sizes
>   make_one_rel
>   query_planner
>   grouping_planner
>   subquery_planner
>   standard_planner
>   planner
>   pg_plan_query
>   pg_plan_queries <-- doesn't receive QueryEnvironment
>   BuildCachedPlan <-- receives QueryEnvironment
>   GetCachedPlan
>   _SPI_execute_plan
>   SPI_execute_plan_with_paramlist
>
> 2. Rip the row estimation out for now, use a bogus hard-coded
> estimate like we do in some other cases, and revisit later. See
> attached (including changes from my previous message).
> Unsurprisingly, a query plan changes.
>
> Thoughts?

I'm not sufficiently familiar with the relevant code to judge this one. Let's
see if planner experts voice an opinion. Absent more opinions, the current
design stands.
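
For readers less familiar with the code, the two options quoted above can be
pictured with a small standalone sketch. It does not use the PostgreSQL
headers: EnrMetadata, QueryEnv, lookup_enr and estimate_enr_rows are
simplified, hypothetical stand-ins for the real structures, and
FALLBACK_ENR_TUPLES is an arbitrary placeholder, not the value in the attached
patch. The idea is only that the estimator looks up the row count by name in
the environment (option 1) and uses a fixed guess when no environment reaches
it (option 2).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the ENR metadata; not the real PostgreSQL struct. */
typedef struct EnrMetadata
{
    const char *name;       /* name the parser resolved, cf. rte->enrname */
    double      enrtuples;  /* row estimate supplied at registration time */
} EnrMetadata;

/* Simplified stand-in for a query environment holding registered relations. */
typedef struct QueryEnv
{
    const EnrMetadata *enrs;
    size_t             nenrs;
} QueryEnv;

/* Hypothetical placeholder estimate, chosen only for illustration. */
#define FALLBACK_ENR_TUPLES 1000.0

/* Find a registered relation by name; NULL if there is no env or no match. */
const EnrMetadata *
lookup_enr(const QueryEnv *env, const char *name)
{
    assert(name != NULL);
    if (env == NULL)
        return NULL;
    for (size_t i = 0; i < env->nenrs; i++)
    {
        if (strcmp(env->enrs[i].name, name) == 0)
            return &env->enrs[i];
    }
    return NULL;
}

/*
 * Option 1: take the estimate from the environment, keyed by name.
 * Option 2: fall back to a fixed guess when the lookup is not possible.
 */
double
estimate_enr_rows(const QueryEnv *env, const char *enrname)
{
    const EnrMetadata *md = lookup_enr(env, enrname);

    return md != NULL ? md->enrtuples : FALLBACK_ENR_TUPLES;
}
```

With this shape the parse-tree node would carry only the name, and the
statistic stays with the object that owns it. The cost is the one described
in the quoted message: the environment has to be passed down through every
planner frame between BuildCachedPlan and the size estimator.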
