Re: Horizontal scalability/sharding

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Ozgun Erdogan <ozgun(at)citusdata(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Mason S <masonlists(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Horizontal scalability/sharding
Date: 2015-09-08 17:52:52
Message-ID: CA+TgmobppAWjs+JXv8CuDfpSBQRb_FcJVV9O508qoTyxkUOdBQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 4, 2015 at 6:52 PM, Ozgun Erdogan <ozgun(at)citusdata(dot)com> wrote:
> Distributed shuffles (Map/Reduce) are hard. When we looked at using FDWs for
> pg_shard, we thought that Map/Reduce would require a comprehensive revamp of
> the APIs.

Well, so you've said. But what kind of API do you want to see?
Taking control at some very high-level hook like ExecutorRun() is not
really a maintainable solution - it's fine if you've only got one guy
doing it, perhaps, but if you have several FDWs talking to different
kinds of remote systems, they can't all seize overall control.

>> For Citus, a second part of the question is as FDW writers. We implemented
> cstore_fdw, json_fdw, and mongo_fdw, and these wrappers don't benefit from
> even the simple join pushdown that doesn't require Map/Reduce.
>
> The PostgreSQL wiki lists 85 foreign data wrappers, and only 18 of these
> have support for joins:
> https://wiki.postgresql.org/wiki/Foreign_data_wrappers

What do you mean by "support for joins"? Do you mean that only 18 of
the remote data sources can do joins? If so, why does that matter?
I'd be quite happy if a join pushdown or "distributed shuffle" API had
as many as 18 users - I'd be quite happy if it had one (postgres_fdw).
The fact that not all FDWs can support every operation because of
limitations on the remote side isn't a reason not to support those
operations when the remote side is capable.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2015-09-08 17:54:49 Re: Separating Buffer LWlocks
Previous Message Robert Haas 2015-09-08 17:44:36 Re: exposing pg_controldata and pg_config as functions