Re: Oddity in handling of cached plans for FDW queries

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Oddity in handling of cached plans for FDW queries
Date: 2016-07-14 12:26:28
Message-ID: CAFjFpRd9rx1kc-Dvm6ogeNKF8C6XxFA-QAH89T08pFHTKCOaFw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 14, 2016 at 5:10 PM, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
wrote:

> On 2016/07/13 18:00, Ashutosh Bapat wrote:
>
>> To fix the first, I'd like to propose (1) replacing the existing
>> has_foreign_join flag in the CachedPlan data structure with a new
>> flag, say uses_user_mapping, that indicates whether a cached plan
>> uses any user mapping regardless of whether the cached plan has
>> foreign joins or not (likewise, replace the hasForeignJoin flag in
>> PlannedStmt), and (2) invalidating the cached plan if the
>> uses_user_mapping flag is true.
>>
>
> That way we will have plan cache invalidations even when simple foreign
>> tables scans (not join) are involved, which means all the plans
>> involving any reference to a foreign table with valid user mapping
>> associated with it. That can be a huge cost as compared to the current
>> solution where sub-optimal plan will be used only when a user mapping is
>> changed while a statement has been prepared. That's a rare scenario and
>> somebody can work around that by preparing the statement again.
>>
>
> I'm not sure that's a good workaround. ISTM that people often don't pay
> much attention to plan changes, so they would execute the inefficient plan
> without realizing the plan change, it would take long, they would start
> thinking what's happening there, and finally, they would find that the
> reason for that is due to the plan change. I think we should prevent such
> a trouble.
>

The case you described is other way round. When the statement was prepared
the join was not pushed down. A change in user mapping afterwards may allow
the join to be pushed down. But right now it won't be, so a user wouldn't
see any difference, right?

>
> IIRC, we
>> had discussed this situation when implementing the cache invalidation
>> logic.
>>
>
> I didn't know that. Sorry for speaking up late.
>
> But there's no workaround for your solution.
>>
>
> As you said, this is a rare scenario; in many cases, people define user
> mappings properly beforehand. So, just invalidating all relevant plans on
> the syscache invalidation events would be fine. (I thought one possible
> improvement might be to track exactly the dependencies of plans on user
> mappings and invalidate just those plans that depend on the user mapping
> being modified the same way for user-defined functions, but I'm not sure
> it's worth complicating the code.)
>

Exactly, for a rare scenario, should we be penalizing large number of plans
or just continue to use a previously prepared plan when an optimal plan has
become available because of changed condition. I would choose second over
the first as it doesn't make things worse than they are.

> I don't think the above change is sufficient to fix the second. The
>> root reason for that is that since currently, we allow the user
>> mapping OID (rel->umid) to be InvalidOid in two cases: (1) user
>> mappings mean something to the FDW but it can't get any user mapping
>> at planning time and (2) user mappings are meaningless to the FDW,
>> we cannot distinguish these two cases.
>>
>
> The way to differentiate between these two is to look at the serverid.
>> If server id is invalid it's the case 1,
>>
>
> Really? Maybe my explanation was not good, but consider a foreign join
> plan created through GetForeignJoinPaths, by an FDW to which user mappings
> are meaningless, like file_fdw. In that plan, the corresponding server id
> would be valid, not invalid. No?
>

While planning join, we invalidate user mapping id if the user mappings do
not match (see build_join_rel()). In such case, the joinrel will have
invalid user mapping (and invalid server id) even though user mapping is
associated with the joining tables. The way to differentiate between this
case and the case when an FDW doesn't need user mappings and the join is
shippable is through valid serverid (see build_join_rel()).
Non-availability of a user mapping for a table whose FDW requires user
mappings should end up in an error (in FDW code), so we shouldn't add
complexity for that case.

>
> So, I'd like to introduce a new callback routine to specify that
>> user mappings mean something to the FDW as proposed by Tom [2], and
>> use that to reject the former case, which allows us to set the above
>> uses_user_mapping flag appropriately, ie, set the flag to true only
>> if user mapping changes require forcing a replan.
>>
>
> This routine is meaningless unless the core (or FDW) does not allow a
>> user mapping to be created for such FDWs. Without that, core code would
>> get confused as to what it should do when it sees a user mapping for an
>> FDW which says user mappings are meaningless.
>>
>
> The core wouldn't care about such a user mapping for the FDW; the core
> would just ignore the user mapping. No?
>

See build_join_rel(). I would like to see, how do you change the conditions
below in that function with your proposal.
468 /*
469 * Set up foreign-join fields if outer and inner relation are
foreign
470 * tables (or joins) belonging to the same server and using the
same user
471 * mapping.
472 *
473 * Otherwise those fields are left invalid, so FDW API will not be
called
474 * for the join relation.
475 *
476 * For FDWs like file_fdw, which ignore user mapping, the user
mapping id
477 * associated with the joining relation may be invalid. A valid
serverid
478 * distinguishes between a pushed down join with no user mapping
and a
479 * join which can not be pushed down because of user mapping
mismatch.
480 */
481 if (OidIsValid(outer_rel->serverid) &&
482 inner_rel->serverid == outer_rel->serverid &&
483 inner_rel->umid == outer_rel->umid)
484 {
485 joinrel->serverid = outer_rel->serverid;
486 joinrel->umid = outer_rel->umid;
487 joinrel->fdwroutine = outer_rel->fdwroutine;
488 }

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2016-07-14 12:42:42 Re: Reviewing freeze map code
Previous Message AMatveev 2016-07-14 12:14:53 Re: One process per session lack of sharing