Quick Links

Re: SQL/MED - core functionality

From:	Shigeru HANADA <hanada(at)metrosystems(dot)co(dot)jp>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: SQL/MED - core functionality
Date:	2010-12-24 10:51:27
Message-ID:	20101224195126.600C.6989961C@metrosystems.co.jp
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, 21 Dec 2010 19:33:04 +0000
Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:
> 1. The docs don't actually say what a foreign table is. Is it a local
> representation of foreign data? Or a local copy of foreign data? Or is
> it a table created on a remote node?

"Foreign table" is an database object which represents the format of
existing external data as PG-table, and it can be used as data source
of SELECT statement. It is like a VIEW rather than a TABLE because
FOREIGN TABLE doesn't have any data locally.

> 2. Will CREATE FOREIGN TABLE require a transactionid? It seems a good
> replacement for temp tables on Hot Standby to be able to run a CREATE
> FOREIGN TABLE using the file_fdw, then reuse the file again later.

AFAIK Yes. CREATE FOREIGN TABLE make records in catalogs (pg_class,
pg_attribute, etc.).

> 3. Do we support CREATE TEMP FOREIGN TABLE? It seems desirable to be
> able to move data around temporarily, as we do with normal tables.

When we support write access to foreign tables, it would be useful.

> 4. In Hot Standby, we are creating many copies of the data tables on
> different servers. That seems to break the concept that data is in only
> one place, when we assume that a foreign table is on only one foreign
> server. How will we represent the concept that data is potentially
> available identically from more than one place? Any other comments about
> how this will work with Hot Standby?

IIUC, If you create FOREIGN TABLE on primary node, it will be
propagated to standby nodes with same generic options. Then, users
connected to standby nodes can use the foreign tables to retrieve
foreign data. If you have multiple standby nodes, all foreign tables
on all nodes including primary node would refer one data source.

For example, file_fdw would read data from the file pointed by
"filename" option, but you can't change the path for each standby
server. You may copy the file to each standby servers, or share one
disk which contains the file by all servers.

OTOH, RDBMS wrappers would refer same server if the SERVER, USER
MAPPING and FOREIGN TABLE have same generic options (host, port,
dbname, etc.), so you would need just one data instance for all of
FOREIGN TABLEs on standby nodes, and data consistency might have to be
checked on the remote side when the data is being changed.

# Um, I might have missed your point...

> 5. In PL/Proxy, we have the concept that a table is sharded across
> multiple nodes. Is that possible here? Again, we seem to have the
> concept that a table is only ever in a single place.

You would able to point one data source from multiple foreign tables
on different PG-nodes.

> 6. Can we do CREATE FOREIGN TABLE .... AS SELECT ...
> I guess the answer depends on (1)

We might be able to support that syntax, but IMHO it doesn't seem too
useful.

> 7. Why does ANALYZE skip foreign tables? Surely its really important we
> know things about a foreign table, otherwise we are going to optimize
> things very badly.

I think ANALYZE is good timing to get statistics of remote data.
In current design, planner calls PlanRelScan() to get costs
(startup/total) of the scan, but it seems difficult to estimate
rows/width by each FDW. I think acquire_sample_rows() would be the
hook point for that purpose. Then, "how to get random sample rows"
would be FDW's matter, but I have not found smart way to acquire
samples without sequential scan on the remote side...

> 8. Is the WHERE clause passed down into a ForeignScan?

Parsed WHERE clause is passed to PlanRelScan() via baserestrictinfo of
RelOptInfo. Wrappers would be able to push it (or part of it) down to
the remote side. Maybe RDBMS wrappers need to implement deparsing
routine similar to deparse_expression() or ri_GenerateQual() for
themselves.

> 9. The docs for CHECK constraints imply that the CHECK is executed
> against any rows returned from FDW. Are we really going to execute that
> as an additional filter on each row retrieved?

In current implementation, CHECK/NOT NULL constraints are not executed,
and I'm not sure that they should be. NOT NULL and CHECK are
supported for table inheritance mainly.

> 10. Can a foreign table be referenced by a FK?

Currently no. FK requires PK on the referenced table, but foreign
table can't have PK constraint.

> 11. Can you create a DO INSTEAD trigger on a FOREIGN TABLE?

Currently no, but it would be useful.

> 12. I think it would be useful for both review and afterwards to write
> the documentation section now, so we can begin to understand this. Will
> there be a documentation section on writing a FDW also? There are enough
> open questions here that I think we need docs and a review guide,
> otherwise we'll end up with some weird missing feature, which would be a
> great shame.

Agreed. ISTM that "V. Server Programming" section is suitable.

> 13. How does this relate to dblink? Is that going to be replaced by this
> feature?

They would be independent each other in first version, and dblink
would have to be maintained because there would be many users.

> 14. How do we do scrollable cursors with foreign tables? Do we
> materialize them always? Or...

We materialize the result as same as normal tables.

> 15. In terms of planning queries, do we have a concept of additional
> cost per row on a foreign server? How does the planner decide how costly
> retrieving data is from the FDW?

Costs for a scan on a foreign table is estimated in FDW routine
PlanRelScan(). So FDW can use arbitrary algorithm to estimate costs.
pgsql_fdw might execute "EXPLAIN SELECT ... FROM xxx" on remote side
to get remote costs.

> 16. If we cancel a query, is there an API call to send query cancel to
> the FDW and so on to the foreign server? Does that still work if we hot
> other kinds of ERROR, or FATAL?

There is no handler for query cancel. If FDW wants cleanup on the
interrupts, resourceowner mechanism would help.

> 17. Can we request different types of transaction isolation on the
> foreign server, or do certain states get passed through from our
> session? e.g. if we are running a serializable transaction, does that
> state get passed through to the FDW, so it knows to request that on the
> foreign server? That seems essential if we are going to make pg_dump
> work correctly.

Currently there is no transaction management hook.

> 18. Does pg_dump dump the data in the FDW or just of the definition of
> the data? Can we have an option for either?

I think that pg_dump need to dump only definition of foreign tables
because 1) multiple foreign tables on different nodes can refer one data
source, and 2) we can't load data to foreign table.

> 19. If we PREPARE a statement, are there API calls to pass thru the
> PREPARE to the FDW? Or are calls always dynamic?

PREPARE calls PlanRelScan() for each foreign tables used in the query,
and EXECUTE calls BeginScan(), Iterate(), EndScan(). Parameters of
EXECUTE statement are passed to Iterate().

> 20. If default privileges include INSERT, UPDATE or DELETE, does this
> cause error, or does it silently get ignored for foreign tables? I think
> I would want the latter.

In current implementation, all default privileges are granted to
foreign table if the owner has default privileges. It breaks
consistency, will fix to ignore privileges which can't be granted via
GRANT statement.

> 21. Can we LOCK a foreign table? I guess so. Presumably no LOCK is
> passed through to the FDW?

LOCK statement can be used on foreign tables, but FDW can't handle the
statement. You mean that locking remote data through LOCK statement?
It would be better to discuss locking with write-access support.

> 22. Can we build an local index on a foreign table?

No. It would be difficult to determine what we should store abstract
"tuple id", ctid for heap tuple, in index tuples.

Regards,
--
Shigeru Hanada

In response to

Re: SQL/MED - core functionality at 2010-12-21 19:33:04 from Simon Riggs

Responses

Re: SQL/MED - core functionality at 2010-12-24 11:34:59 from Simon Riggs

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Shigeru HANADA	2010-12-24 11:04:45	Re: SQL/MED - file_fdw
Previous Message	Nicolas Barbier	2010-12-24 10:23:50	Re: proposal : cross-column stats