Re: Join push-down support for foreign tables

From: Shigeru HANADA <shigeru(dot)hanada(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Join push-down support for foreign tables
Date: 2014-09-07 23:27:26
Message-ID: 540CE9DE.207@gmail.com
Lists: pgsql-hackers

(2014/09/05 0:56), Bruce Momjian wrote:
> On Thu, Sep 4, 2014 at 08:41:43PM +0530, Atri Sharma wrote:
>> On Thursday, September 4, 2014, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>>
>> On Thu, Sep 4, 2014 at 08:37:08AM -0400, Robert Haas wrote:
>> > The main problem I see here is that accurate costing may require a
>> > round-trip to the remote server. If there is only one path that is
>> > probably OK; the cost of asking the question will usually be more than
>> > paid for by hearing that the pushed-down join clobbers the other
>> > possible methods of executing the query. But if there are many paths,
>> > for example because there are multiple sets of useful pathkeys, it
>> > might start to get a bit expensive.
>> >
>> > Probably both the initial cost and final cost calculations should be
>> > delegated to the FDW, but maybe within postgres_fdw, the initial cost
>> > should do only the work that can be done without contacting the remote
>> > server; then, let the final cost step do that if appropriate. But I'm
>> > not entirely sure what is best here.
>>
>> I am thinking eventually we will need to cache the foreign server
>> statistics on the local server.
>>
>> Wouldn't that lead to issues where the statistics get outdated and we have to
>> anyways query the foreign server before planning any joins? Or are you thinking
>> of dropping the foreign table statistics once the foreign join is complete?
>
> I am thinking we would eventually have to cache the statistics, then get
> some kind of invalidation message from the foreign server. I am also
> thinking that cache would have to be global across all backends, I guess
> similar to our invalidation cache.

If an FDW needs more information than pg_statistic and pg_class
provide, then yes, it should cache some statistics on the local side. But
such statistics would have an FDW-specific shape, so it would be hard to
design a common API to manage them. An FDW can have its own functions and
tables to manage its own statistics, and it can even have a background
worker for messaging. But that would be another story.
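
To make the caching idea a bit more concrete, here is a rough sketch in
Python (not PostgreSQL code) of a local cache of remote statistics that
contacts the remote side only on a miss and drops an entry when an
invalidation message arrives. Every name in it (RemoteTableStats,
LocalStatsCache, fake_remote_fetch) is hypothetical, and the fetch
function only stands in for a real round-trip to the foreign server.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class RemoteTableStats:
    """Hypothetical FDW-specific statistics for one remote table."""
    rows: float
    width: int


class LocalStatsCache:
    """Toy local cache of remote statistics, keyed by (server, table)."""

    def __init__(self, fetch: Callable[[str, str], RemoteTableStats]) -> None:
        self._fetch = fetch  # stands in for the remote round-trip
        self._entries: Dict[Tuple[str, str], RemoteTableStats] = {}
        self.round_trips = 0  # how many times the remote side was asked

    def get(self, server: str, table: str) -> RemoteTableStats:
        key = (server, table)
        if key not in self._entries:
            # Cache miss: pay for one round-trip and remember the answer.
            self.round_trips += 1
            self._entries[key] = self._fetch(server, table)
        return self._entries[key]

    def invalidate(self, server: str, table: str) -> None:
        # Called when the remote side reports that its statistics changed.
        self._entries.pop((server, table), None)


def fake_remote_fetch(server: str, table: str) -> RemoteTableStats:
    # Assumption: a real FDW would query the foreign server here.
    return RemoteTableStats(rows=1000.0, width=32)


cache = LocalStatsCache(fake_remote_fetch)
cache.get("srv1", "orders")         # miss: first round-trip
cache.get("srv1", "orders")         # hit: no round-trip
cache.invalidate("srv1", "orders")  # remote statistics changed
cache.get("srv1", "orders")         # miss again: second round-trip
```

The sketch is per-process only. A cache shared across all backends, as
suggested upthread, would need shared storage and some way to deliver the
invalidation messages, neither of which is shown here.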

Regards,
--
Shigeru HANADA
