Re: RTE_NAMEDTUPLESTORE, enrtuples and comments

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Kevin Grittner <kgrittn(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Subject: Re: RTE_NAMEDTUPLESTORE, enrtuples and comments
Date: 2017-06-12 04:04:23
Message-ID: CAEepm=0Y6RqjjE6WMOx9QN5qp2i3jN2kBtW3Y-jFNxiVNNhSgw@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jun 11, 2017 at 11:11 PM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> On Sun, Jun 11, 2017 at 6:25 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
>> This fourth point is not necessarily a defect: I wonder if RangeTblEntry is
>> the right place for enrtuples. It's a concept regularly seen in planner data
>> structures but not otherwise seen at parse tree level.
>
> I agree that this is strange. Perhaps
> set_namedtuplestore_size_estimates should instead look up the
> EphemeralNamedRelation by rte->enrname to find its way to
> enr->md.enrtuples, but I'm not sure off the top of my head how it
> should get its hands on the QueryEnvironment required to do that. I
> will look into this on Monday, but other ideas/clues welcome...

Here's some background: If you look at the interface changes
introduced by 18ce3a4 you will see that it is now possible for a
QueryEnvironment object to be injected into the parser and executor.
Currently the only use for it is to inject named tuplestores into the
system via SPI_register_relation or SPI_register_trigger_data. That's
to support SQL standard transition tables, but anyone can now use SPI
to expose data to SQL via an ephemeral named relation in this way.
(In the future we could imagine other kinds of objects, such as table
variables, anonymous functions or streams.)

The QueryEnvironment is used by the parser to resolve names and build
RangeTblEntry objects. The planner doesn't currently need it, because
all the information it needs is in the RangeTblEntry, including the
offending row estimate. Then the executor needs it to get its hands
on the tuplestores. So the question is: how can we get it into
costsize.c, so that it can look up the EphemeralNamedRelationMetadata
object by name, instead of trafficking statistics through parser data
structures?
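
To make the question concrete, here is a minimal, self-contained sketch
of the by-name lookup in plain C. It is a simplified model, not
PostgreSQL source: the struct layouts and the helpers find_enr and
enr_tuples_by_name are hypothetical, and only the ideas of an ENR name
and an enrtuples estimate come from the discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the metadata of an ephemeral named relation. */
typedef struct EnrMetadata
{
    const char *name;       /* name the parser resolves */
    double      enrtuples;  /* row-count estimate wanted by the planner */
} EnrMetadata;

/* Simplified stand-in for a QueryEnvironment: a fixed array of ENRs. */
typedef struct QueryEnv
{
    const EnrMetadata *enrs;
    size_t             nenrs;
} QueryEnv;

/* Hypothetical helper: return the metadata for 'name', or NULL if absent. */
const EnrMetadata *
find_enr(const QueryEnv *env, const char *name)
{
    assert(name != NULL);
    if (env == NULL)
        return NULL;            /* no environment was supplied */
    for (size_t i = 0; i < env->nenrs; i++)
    {
        if (strcmp(env->enrs[i].name, name) == 0)
            return &env->enrs[i];
    }
    return NULL;
}

/* Hypothetical helper: estimate looked up by name, or -1 if unknown. */
double
enr_tuples_by_name(const QueryEnv *env, const char *name)
{
    const EnrMetadata *md = find_enr(env, name);

    return md ? md->enrtuples : -1.0;
}
```

With something of this shape the estimate would no longer need to live
in the RangeTblEntry, but the size estimator would need a pointer to the
environment, which is the open problem.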

Here are a couple of ways forward that I can see:

1. Figure out how to get the QueryEnvironment through more of these
stack frames (possibly inside other objects), so that
set_namedtuplestore_size_estimates can look up enrtuples by enrname:

    set_namedtuplestore_size_estimates <-- would need QueryEnvironment
    set_namedtuplestore_pathlist
    set_rel_size
    set_base_rel_sizes
    make_one_rel
    query_planner
    grouping_planner
    subquery_planner
    standard_planner
    planner
    pg_plan_query
    pg_plan_queries <-- doesn't receive QueryEnvironment
    BuildCachedPlan <-- receives QueryEnvironment
    GetCachedPlan
    _SPI_execute_plan
    SPI_execute_plan_with_paramlist
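
One way to read "possibly inside other objects" is sketched below as a
self-contained toy in C: the entry point stores the environment pointer
once in a state struct that the intermediate frames already pass along,
so their signatures do not change. All names here (PlannerState,
plan_query_model and the *_model functions) are hypothetical stand-ins,
not PostgreSQL functions, and the environment is reduced to a single
estimate.

```c
#include <assert.h>
#include <stddef.h>

/* Toy environment holding the estimate for a single ENR. */
typedef struct QueryEnv
{
    double enrtuples;
} QueryEnv;

/* Toy per-query planner state that every frame already receives. */
typedef struct PlannerState
{
    const QueryEnv *query_env;  /* set once at the entry point; may be NULL */
    double          rows;       /* estimate produced by the leaf frame */
} PlannerState;

/* Leaf frame: the only place that actually reads the environment. */
void
estimate_size_model(PlannerState *state)
{
    assert(state != NULL);
    state->rows = state->query_env ? state->query_env->enrtuples : -1.0;
}

/* Intermediate frames: unchanged signatures, they just forward 'state'. */
void
set_rel_size_model(PlannerState *state)
{
    estimate_size_model(state);
}

void
make_one_rel_model(PlannerState *state)
{
    set_rel_size_model(state);
}

/* Entry point: the one frame that must newly accept the environment. */
double
plan_query_model(const QueryEnv *env)
{
    PlannerState state = {env, 0.0};

    make_one_rel_model(&state);
    return state.rows;
}
```

The alternative, adding a parameter to every function in the stack,
would touch each frame listed above rather than only the top and the
bottom.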

2. Rip the row estimation out for now, use a bogus hard-coded
estimate like we do in some other cases, and revisit later. See
attached (including changes from my previous message).
Unsurprisingly, a query plan changes.
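
For illustration only, option 2 boils down to something like the
following self-contained C sketch. The constant DEFAULT_ENR_TUPLES, its
value of 1000, and the RelSizeModel struct are assumptions made for this
example; they are not taken from the attached patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical placeholder estimate; the real value is a policy choice. */
#define DEFAULT_ENR_TUPLES 1000.0

/* Toy stand-in for the planner's per-relation size information. */
typedef struct RelSizeModel
{
    double tuples;
} RelSizeModel;

/* Ignore any per-relation statistics and always use the fixed guess. */
void
set_size_estimate_hardcoded(RelSizeModel *rel)
{
    assert(rel != NULL);
    rel->tuples = DEFAULT_ENR_TUPLES;
}
```

Because the estimate no longer depends on the actual tuplestore size,
any plan whose shape was sensitive to that number can come out
differently.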

Thoughts?

--
Thomas Munro
http://www.enterprisedb.com

Attachment Content-Type Size
fixes-for-enr-rte-review-v2.patch application/octet-stream 5.3 KB
