Re: pgsql_fdw, FDW for PostgreSQL server

From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: pgsql_fdw, FDW for PostgreSQL server
Date: 2011-11-28 11:50:27
Message-ID: 4ED37583.8020800@gmail.com
Lists: pgsql-hackers

Hi Fujita-san,

(2011/11/25 17:27), Etsuro Fujita wrote:
> I'm still under reviewing, so the following is not all. I'm sorry.
> estimate_costs() have been implemented to ask a remote postgres server
> for the result of EXPLAIN for a remote query to get its costs such as
> startup_cost and total_cost. I think this approach is the most accurate
> way to get its costs. However, I think it would be rather costly. And
> I'm afraid of that it might work only for pgsql_fdw.

Indeed. In addition, this approach assumes that the cost factors of the
target PG server are the same as the local ones. pgsql_fdw might need
to accept cost factors as FDW options of the foreign server.
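
To make the EXPLAIN-based idea concrete, here is a minimal sketch in
Python (not the C code of the patch). It pulls startup_cost, total_cost
and rows out of the top plan line that a remote PostgreSQL server
returns for EXPLAIN, then rescales the costs. The function names and
the single `cost_scale` multiplier are assumptions made for
illustration only; they are not existing pgsql_fdw options.

```python
import re

# Cost annotation printed on each PostgreSQL plan node, for example:
#   Seq Scan on t1  (cost=0.00..35.50 rows=2550 width=4)
_COST_RE = re.compile(
    r"\(cost=(?P<startup>\d+\.\d+)\.\.(?P<total>\d+\.\d+) "
    r"rows=(?P<rows>\d+) width=(?P<width>\d+)\)"
)


def parse_explain_top_line(line):
    """Extract the planner estimates from the first line of EXPLAIN output."""
    match = _COST_RE.search(line)
    if match is None:
        raise ValueError("no cost annotation found in: %r" % line)
    return {
        "startup_cost": float(match.group("startup")),
        "total_cost": float(match.group("total")),
        "rows": int(match.group("rows")),
        "width": int(match.group("width")),
    }


def scale_costs(estimate, cost_scale):
    """Rescale remote costs by a hypothetical per-server factor.

    cost_scale is an assumed option, standing in for whatever cost
    factors a foreign server might expose; rows and width are unchanged.
    """
    scaled = dict(estimate)
    scaled["startup_cost"] = estimate["startup_cost"] * cost_scale
    scaled["total_cost"] = estimate["total_cost"] * cost_scale
    return scaled
```

A single multiplier is the crudest possible mapping; per-factor options
(one for each planner cost constant) would need the remote plan shape as
well, which this sketch does not attempt.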

> Because, even if we
> are able to obtain such a cost information by EXPLAINing a remote query
> at a remote server where a DBMS different from postgres runs, it might
> be difficult to incorporate such a cost information with the postgres
> cost model due to their possible inconsistency that such a cost
> information provided by the EXPLAIN command in the other DBMS might have
> different meanings (or different scales) from that provided by the
> EXPLAIN command in postgres.

Yes, so implementing accurate cost estimation for other DBMSs would be
very difficult, but AFAICS row estimation is the most important factor,
so a reasonable row count and a relatively high startup cost would
produce a not-so-bad plan.
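
As a rough illustration of that fallback, the sketch below (Python, for
brevity) builds a cost pair from nothing more than an estimated row
count. The function name and both default values are assumptions: the
startup cost of 100 is an arbitrary "relatively high" figure, and 0.01
simply mirrors PostgreSQL's default cpu_tuple_cost.

```python
def fallback_costs(est_rows, startup_cost=100.0, cost_per_row=0.01):
    """Return (startup_cost, total_cost) for a foreign scan.

    Intended for remote DBMSs whose own cost figures cannot be mapped
    onto the PostgreSQL cost model. Only the row estimate is trusted;
    the other two parameters are hypothetical tuning knobs.
    """
    if est_rows < 0:
        raise ValueError("est_rows must be non-negative")
    total_cost = startup_cost + cost_per_row * est_rows
    return startup_cost, total_cost
```

With these defaults the startup cost dominates until the row count gets
large, which is the point: the planner sees contacting the remote
server as expensive, while the row estimate still drives join choices.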

> So, I think it might be better to estimate
> such costs by pgsql_fdw itself without EXPLAINing on the assumption that
> a remote postgres server has the same abilities for query optimization,
> which is less costly and widely applicable to the other DBMSs, while it,
> of course, only works once we have statistics and/or index information
> for foreign tables. But AFAIK we eventually want to have those, so I'd
> like to propose to use the proposed approach until that time.

Knowledge of foreign indexes also provides information about sort order.
With such information, the planner will be able to consider a merge join
without a local sort. Without foreign index information, we have to
enumerate possible sort keys with a brute-force approach to get the same
result, as mentioned by Itagaki-san before.
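
The difference in search space can be sketched as follows (Python, for
illustration only; both function names are hypothetical, and sort
direction and NULLS ordering are ignored). The first function
enumerates every ordered combination of columns; the second keeps only
the orderings that known foreign indexes could supply, assuming each
index is described by its ordered column list.

```python
from itertools import permutations


def brute_force_sort_keys(columns):
    """Enumerate every ordered, non-empty combination of columns."""
    keys = []
    for length in range(1, len(columns) + 1):
        keys.extend(permutations(columns, length))
    return keys


def sort_keys_from_indexes(indexes):
    """Collect the orderings that the given indexes can provide.

    Every non-empty prefix of an index's column list is an ordering
    that an index scan on it can return. Duplicates are dropped while
    preserving first-seen order.
    """
    keys = []
    for index_columns in indexes:
        for length in range(1, len(index_columns) + 1):
            key = tuple(index_columns[:length])
            if key not in keys:
                keys.append(key)
    return keys
```

For three columns the brute-force list already has 15 candidates
(3 + 6 + 6), while a single two-column index contributes only 2.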

Regards,
--
Shigeru Hanada
