Re: Oddity in handling of cached plans for FDW queries

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Oddity in handling of cached plans for FDW queries
Date: 2016-07-14 11:40:57
Message-ID: bbd3d81d-1596-b7a6-833d-ddfc6754c82f@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2016/07/13 18:00, Ashutosh Bapat wrote:
> To fix the first, I'd like to propose (1) replacing the existing
> has_foreign_join flag in the CachedPlan data structure with a new
> flag, say uses_user_mapping, that indicates whether a cached plan
> uses any user mapping regardless of whether the cached plan has
> foreign joins or not (likewise, replace the hasForeignJoin flag in
> PlannedStmt), and (2) invalidating the cached plan if the
> uses_user_mapping flag is true.

> That way we will have plan cache invalidations even when simple foreign
> tables scans (not join) are involved, which means all the plans
> involving any reference to a foreign table with valid user mapping
> associated with it. That can be a huge cost as compared to the current
> solution where sub-optimal plan will be used only when a user mapping is
> changed while a statement has been prepared. That's a rare scenario and
> somebody can work around that by preparing the statement again.

I'm not sure that's a good workaround. ISTM that people often don't pay
much attention to plan changes, so they would execute the inefficient
plan without realizing the plan change, it would take long, they would
start thinking what's happening there, and finally, they would find that
the reason for that is due to the plan change. I think we should
prevent such a trouble.

> IIRC, we
> had discussed this situation when implementing the cache invalidation
> logic.

I didn't know that. Sorry for speaking up late.

> But there's no workaround for your solution.

As you said, this is a rare scenario; in many cases, people define user
mappings properly beforehand. So, just invalidating all relevant plans
on the syscache invalidation events would be fine. (I thought one
possible improvement might be to track exactly the dependencies of plans
on user mappings and invalidate just those plans that depend on the user
mapping being modified the same way for user-defined functions, but I'm
not sure it's worth complicating the code.)

> One benefit of using the proposed approach is that we could make the
> FDW's handling of user mappings in BeginForeignScan more efficient;
> currently, there is additional overhead caused by catalog re-lookups
> to obtain the user mapping information for the
> simple-foreign-table-scan case where user mappings mean something to
> the FDW as in postgres_fdw (probably, updates on the catalogs are
> infrequent, though), but we could improve the efficiency by using
> the validated user mapping information created at planning time for
> that case as well as for the foreign-join case.

> postgres_fdw to fetches user mapping in some cases but never remembers
> it. If what you are describing is a better way, it should have been
> implemented before join pushdown was implemented. Refetching a user
> mapping is not that expensive given that there is a high chance that it
> will be found in the syscache, because it was accessed at the time of
> planning.

That assumption is reasonably valid if execution is done immediately
after planning, but that doesn't necessarily follow.

> Effect of plan cache invalidation is probably worse than
> fetching the value from a sys cache again.

As I said above, we could expect updates on pg_user_mapping to be
infrequent, so the effect of the plan cache invalidation would be more
limited than that of re-fetching user mappings during BeginForeignScan.

> I don't think the above change is sufficient to fix the second. The
> root reason for that is that since currently, we allow the user
> mapping OID (rel->umid) to be InvalidOid in two cases: (1) user
> mappings mean something to the FDW but it can't get any user mapping
> at planning time and (2) user mappings are meaningless to the FDW,
> we cannot distinguish these two cases.

> The way to differentiate between these two is to look at the serverid.
> If server id is invalid it's the case 1,

Really? Maybe my explanation was not good, but consider a foreign join
plan created through GetForeignJoinPaths, by an FDW to which user
mappings are meaningless, like file_fdw. In that plan, the
corresponding server id would be valid, not invalid. No?

> So, I'd like to introduce a new callback routine to specify that
> user mappings mean something to the FDW as proposed by Tom [2], and
> use that to reject the former case, which allows us to set the above
> uses_user_mapping flag appropriately, ie, set the flag to true only
> if user mapping changes require forcing a replan.

> This routine is meaningless unless the core (or FDW) does not allow a
> user mapping to be created for such FDWs. Without that, core code would
> get confused as to what it should do when it sees a user mapping for an
> FDW which says user mappings are meaningless.

The core wouldn't care about such a user mapping for the FDW; the core
would just ignore the user mapping. No?

Best regards,
Etsuro Fujita

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2016-07-14 11:46:23 pg_xlogdump follow into the future
Previous Message Amit Kapila 2016-07-14 11:03:47 Re: Hash Indexes