Re: Question about optimising (Postgres_)FDW

From: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Question about optimising (Postgres_)FDW
Date: 2014-04-16 04:12:45
Message-ID: 534E033D.4020407@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 04/16/2014 01:25 AM, Tom Lane wrote:
> Hannu Krosing <hannu(at)2ndQuadrant(dot)com> writes:
>> Is there a way to force it to prefer a plan where the results of (select
>> id from onemillion where data > '0.9' limit 100)
>> are passed to FDW as a single IN ( = ANY(...)) query and are retrieved
>> all at once ?
> You could write the query like that:
>
> select * from onemillion_pgfdw where id = any (array(select id from
> onemillion where data > '0.9' limit 100));
My actual use-case was about a join between a local and a remote table
and without rewriting the query (they come from ORM)

I was hoping to be able to nudge postgresql towards a better plan via some
tuning of table/fdw options or GUCs.

for example, would postgresql use the WHERE id IN (...) query on remote
side for a query like

select r.data, l.data
from onemillion_pgfdw r
join onemillion l
on r.id = l.id and l.data > '0.999';

if it recognizes that the local side returns only 1000 rows ?

or would it still use 1000 individual WHERE id = $1 queries.

Is getting the foreign data via IN and then turning the data into a hash
for joining one of the plans it considers at all ?

Best
Hannu

>
> Or at least you should be able to, except when I try it I get
>
> explain analyze
> select * from onemillion_pgfdw where id = any (array(select id from
> onemillion where data > '0.9' limit 100));
> ERROR: operator does not exist: integer = integer[]
> HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
> CONTEXT: Remote SQL command: EXPLAIN SELECT id, inserted, data FROM public.onemillion WHERE ((id = ANY ((SELECT null::integer[]))))
>
> so there's something the remote-estimate code is getting wrong here.
> (It seems to work without remote_estimate, though.)
>
> regards, tom lane

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2014-04-16 04:27:10 Re: Clock sweep not caching enough B-Tree leaf pages?
Previous Message Kouhei Kaigai 2014-04-16 04:11:11 Re: Custom Scan APIs (Re: Custom Plan node)