Re: EvalPlanQual behaves oddly for FDW queries involving system columns

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, ashutosh(dot)bapat(at)enterprisedb(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: EvalPlanQual behaves oddly for FDW queries involving system columns
Date: 2015-05-11 22:42:08
Message-ID: 25332.1431384128@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On further consideration ...

I don't much like the division of labor between LockForeignRow and
FetchForeignRow. In principle that would lead to not one but two
extra remote accesses per locked row in SELECT FOR UPDATE, at least
in the case that an EvalPlanQual recheck is required. (I see that
in your prototype patch for postgres_fdw you attempt to avoid that
by saving a retrieved tuple in LockForeignRow and then returning it
in FetchForeignRow, but that seems extremely ugly and bug-prone,
since there is nothing in the API spec insisting that those calls match
up one-to-one.) The fact that locking and fetching a tuple are separate
operations in the heapam API is a historical artifact that probably
doesn't make sense to duplicate in the FDW API.

Another problem is that we fail to detect whether an EvalPlanQual recheck
is required during a SELECT FOR UPDATE on a remote table, which we
certainly ought to do if the objective is to make postgres_fdw semantics
match the local ones.

What I'm inclined to do is merge LockForeignRow and FetchForeignRow
into one operation, which would have the semantics of returning a
palloc'd tuple, or NULL on a SKIP LOCKED failure, and it would be expected
to acquire a lock according to erm->markType (in particular, no lock just
a re-fetch for ROW_MARK_REFERENCE). In addition it needs some way of
reporting that the returned row is an updated version rather than the
original. Probably just an extra output boolean would do for that.

This'd require some refactoring in nodeLockRows, because we'd want to
be able to hang onto the returned tuple without necessarily provoking
an EvalPlanQual cycle, but it doesn't look too bad.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joshua D. Drake 2015-05-11 22:42:26 Re: Multi-xacts and our process problem
Previous Message Alvaro Herrera 2015-05-11 22:25:07 Re: deparsing utility commands