We have just released the PGSpider extension (pgspider_ext).
It is a PostgreSQL extension for building a high-performance SQL cluster engine for distributed big data.
PGSpider lets PostgreSQL access a number of data sources through Foreign Data Wrappers (FDW) and retrieve the distributed data vertically: rows from the individual data sources are combined into a single result set.
The main feature is:
* Node partitioned table
Users can retrieve records from multiple tables on different data sources with a single SQL statement.
Suppose there are two data sources that hold the following records:

    SELECT * FROM t1_node1; -- @node1
     i  | t
    ----+---
     10 | a
     11 | b
    (2 rows)

    SELECT * FROM t1_node2; -- @node2
     i  | t
    ----+---
     20 | c
     21 | d
    (2 rows)

PGSpider can collect these records in one query and adds a node identifier column to each row:

    SELECT * FROM t1;
     i  | t | node
    ----+---+-------
     10 | a | node1
     11 | b | node1
     20 | c | node2
     21 | d | node2
    (4 rows)

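To make the semantics of the node column concrete, the combined result is conceptually what you would get by stacking the two tables by hand in plain PostgreSQL. This is only a sketch of the meaning of the query, not how pgspider_ext is configured or implemented, and it assumes that both t1_node1 and t1_node2 are visible in the same database:

```sql
-- Conceptual equivalent of "SELECT * FROM t1" for the two tables above.
-- Each branch tags its rows with a literal naming the source node.
SELECT i, t, 'node1' AS node FROM t1_node1
UNION ALL
SELECT i, t, 'node2' AS node FROM t1_node2
ORDER BY node, i;  -- ordering added only to make the output deterministic
```

With a node partitioned table, the per-source branches do not have to be written out in every query.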
PGSpider can fetch results from data sources in parallel.
PGSpider can push down WHERE clauses and aggregate functions to the data sources.
Whether a given expression can be shipped depends on the FDW used for each data source.
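One way to see whether a filter or aggregate is being shipped is to look at the query plan with PostgreSQL's standard EXPLAIN command. The query below is an illustrative sketch that reuses the t1 table from the example above; the plan output is not shown because it depends on the FDW behind each data source and on the PostgreSQL version:

```sql
-- Show the plan, including per-node details reported by the FDW.
-- For FDWs that report the remote query, a pushed-down condition or
-- aggregate appears in the remote part of the plan rather than as a
-- local Filter or Aggregate node.
EXPLAIN (VERBOSE)
SELECT count(*)
FROM t1
WHERE i > 10;
```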
pgspider_ext is developed by Toshiba Software Engineering & Technology Center.
Source repository: https://github.com/pgspider/pgspider_ext