Re: Proposal: Create index on foreign table

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Create index on foreign table
Date: 2012-03-20 19:39:36
Message-ID: CA+TgmobD9oy5Y3ksk5jmTnDS+aoE74jgWnsa86BqWd3YmDXRsA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 16, 2012 at 4:44 AM, Etsuro Fujita
<fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> I have a plan to support 'Create index on foreign table' for 9.3.  Here
> is my plan.
>
> The index creation is supported for a flat file such as CSV and a remote
> table on a RDB e.g., Postgres using CREATE INDEX.  (I thought using a
> new statement, CREATE FOREIGN INDEX, at first, but I think that CREATE
> INDEX would be sufficient to define an index for the foreign table.)
> For a flat file, CREATE INDEX constructs an index in the same way as an
> index for a regular table.

It seems really weird to have the data half inside the database and
half outside of it like this. What is the use case for this feature?
I would have thought that something like file_fdw would be primarily
targeted toward use cases where you intend to read the data file only
a few times, or you always intend to sequential scan it. If you need
to index it, why not store the data in a regular table? That's what
they are for. What you're proposing sounds to me like a lot of work
for no real gain.

> On the other hand, for a remote table,
> CREATE INDEX collects information about the index on the specified
> column(s) for the specified table that was created on the remote table.

I can possibly see the point of this. Having local information about
which remote indexes are available seems like it could be useful. As
Heikki says, you could cache it on a per-session basis, but that might
not be very efficient.

I also think that it would not be a very good idea to have CREATE
INDEX on a foreign table sometimes really create an index and other
times just define the properties of a remote index. If we're going to
have both features at all, I think they should use different syntax.
I suggest that CREATE INDEX or CREATE FOREIGN INDEX is the right way
to provide the query planner with information about remote-side
indexes; and that if we even want to have indexes on flat files, the
interface to those should be exported via functions bundled in the
file_fdw extension rather than DDL. There's no reason to suppose that
the indexes the FDW supports correspond to PostgreSQL's AMs, so tying
it into that framework doesn't seem wise.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Stark 2012-03-20 19:41:51 Re: Memory usage during sorting
Previous Message Robert Haas 2012-03-20 19:15:46 Re: [v9.2] Add GUC sepgsql.client_label