Joins between foreign tables

From: Jason Dusek <jason(dot)dusek(at)gmail(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Joins between foreign tables
Date: 2015-06-08 16:45:16
Message-ID: CAO3NbwO4jDnXc=ihQ1dbm=wRwJfAYoXdVGJSAaJoXVYiohfdqQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Dear Postgres,

Consider a query like:

SELECT table_on_server1.email
FROM table_on_server1 JOIN table_on_server2
ON (table_on_server1.id = table_on_server2.user_id)
WHERE table_on_server2.created_at >= date_trunc('day', now())

One could imagine the plan being something like:

1.

Find all the user_ids that result from SELECT user_id FROM
table_on_server2 WHERE table_on_server2.created_at >= date_trunc('day',
now())
2.

Pass these IDs to a query executed on server1, pushing them down in a
WHERE clause.

However, the query actually doesn’t perform at all like that. Which is to
say, if one executes the query in (1) from the console and copy pastes the
resulting IDs into an IN clause in a second query against table_on_server1,
the query returns quickly (milliseconds); whereas if one runs the naive
query at the beginning of this email, it does not return for seconds.

What are things we can do to get good performance for queries like this?
We’ve tied moving the search for IDs into a temp table and a CTE; it
doesn’t seem to make any difference. (Which suggests that joins between one
local and one remote table won’t perform well in general, either.)

Best Regards,

Jason Dusek

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2015-06-08 16:58:48 Re: Momentary Delay
Previous Message Andres Freund 2015-06-08 16:37:07 Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1