Re: patch: SQL/MED(FDW) DDL

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Shigeru HANADA <hanada(at)metrosystems(dot)co(dot)jp>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, SAKAMOTO Masahiko <sakamoto(dot)masahiko(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch: SQL/MED(FDW) DDL
Date: 2010-10-05 15:38:58
Message-ID: AANLkTinEwo9kcE0FYaGATGmwurTN7AiWeQzZ1=oY3djc@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 5, 2010 at 11:06 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Tue, Oct 5, 2010 at 10:41 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> (I'd also say that your performance estimate is miles in advance of any
>>> facts; but even if it's true, the caching ought to be inside the FDW,
>>> because we have no clear idea of what it will need to cache.)
>
>> I can't imagine how an FDW could possibly be expected to perform well
>> without some persistent local data storage.  Even assume the remote
>> end is PG.  To return a cost, it's going to need the contents of
>> pg_statistic cached locally, for each remote table.
>
> Or it could ask the remote side.

FWIW, I mentioned that option in the part you didn't quote.

>> Do you really
>> think it's going to work to incur that overhead once per table per
>> backend startup?
>
> If you have a cache, how are you going to manage updates of it?

I'm not. I'm going to let the FDW determine how often it would like
to refresh its cache, as well as what it would like to cache and in
what format it would like to cache it.
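To make that concrete, here is a minimal sketch of a cache that the wrapper itself owns. It is written in Python purely for illustration; it is not part of the patch or of any proposed FDW API. The names RemoteStatsCache, fetch, and ttl_seconds are hypothetical. The point is only that the refresh interval, the cached contents, and their format are all the wrapper's choice, not the core's.

```python
import time
from typing import Any, Callable, Dict, Tuple


class RemoteStatsCache:
    """Hypothetical wrapper-owned cache of remote table statistics.

    The wrapper supplies `fetch` (how to obtain statistics and in what
    format) and `ttl_seconds` (how often to refresh). Nothing here is
    dictated by the core server.
    """

    def __init__(self, fetch: Callable[[str], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        # table name -> (time fetched, opaque statistics payload)
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, table: str) -> Any:
        """Return cached statistics, refetching once the entry is too old."""
        now = self._clock()
        entry = self._entries.get(table)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = self._fetch(table)
        self._entries[table] = (now, value)
        return value
```

A wrapper for a remote PG server might pass a fetch function that pulls the relevant pg_statistic rows, while a CSV wrapper might cache nothing more than a file size. Either way, the planner just asks the wrapper for a cost.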

> IMO this is a *clear* case of premature optimization being the root of
> all evil.  We should get it working first and then see what needs to be
> optimized by measuring, rather than guessing in a vacuum.

I have no problem with punting the issue of remote statistics to some
time in the future. But I don't think we should have a half-baked
implementation of remote statistics. We should either do it right
(doing such testing as is necessary to establish what that means) or
not do it at all. Frankly, if we could get from where we are today to
a workable implementation of this technology for CSV files in time for
9.1, I think that would be an impressive accomplishment. Making it
work for more complicated cases is almost certainly material for 9.2,
9.3, 9.4, and maybe further out than that.

> (BTW, if the remote end is PG I would hardly think that we'd send SQL
> queries at all.  If we're interested in micro-optimization, we'd devise
> some lower-level protocol.)

Interesting.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
