Re: Horizontal scalability/sharding

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Mason S <masonlists(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Horizontal scalability/sharding
Date: 2015-08-31 21:47:12
Message-ID: CA+TgmoZCPeHjhMcifNoZA9e5MZ+9QB=KF2kPtw9ju51cKqh9hQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 31, 2015 at 4:16 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> First, let me put out there that I think the horizontal scaling project
> which has buy-in from the community and we're working on is infinitely
> better than the one we're not working on or is an underresourced fork.
> So we're in agreement on that. However, I think there's a lot of room
> for discussion; I feel like the FDW approach was decided in exclusive
> meetings involving a very small number of people. The FDW approach
> *may* be the right approach, but I'd like to see some rigorous
> questioning of that before it's final.

It seems to me that sharding consists of (1) breaking your data set up
into shards, (2) possibly replicating some of those shards onto
multiple machines, and then (3) being able to access the remote data
from local queries. As far as (1) is concerned, we need declarative
partitioning, which is being worked on by Amit Langote. As far as (2)
is concerned, I hope and expect BDR, or technology derived therefrom,
to eventually fill that need. As far as (3) is concerned, why
wouldn't we use the foreign data wrapper interface, and specifically
postgres_fdw? That interface was designed for the explicit purpose of
allowing access to remote data sources, and a lot of work has been put
into it, so it would be highly surprising if we decided to throw that
away and develop something completely new from the ground up.

It's true that postgres_fdw doesn't do everything we need yet. The
new join pushdown hooks aren't used by postgres_fdw yet, and the API
itself has some bugs with EvalPlanQual handling. Aggregate pushdown
is waiting on upper planner path-ification. DML pushdown doesn't
exist yet, and the hooks that would enable pushdown of ORDER BY
clauses to the remote side aren't being used by postgres_fdw. But all
of these things have been worked on. Patches for many of them have
already been posted. They have suffered from a certain amount of
neglect by senior hackers, and perhaps also from a shortage of time on
the part of the authors. But an awful lot of the work that is needed
here has already been done, if only we could get it committed.
Aggregate pushdown is a notable exception, but abandoning the foreign
data wrapper approach in favor of something else won't fix that.

Postgres-XC developed a purpose-built system for talking to other
nodes instead of using the FDW interface, for the very good reason
that the FDW interface did not yet exist at the time that Postgres-XC
was created. But several people associated with the XC project have
said, including one on this thread, that if it had existed, they
probably would have used it. And it's hard to see why you wouldn't:
with XC's approach, the remote data source is presumed to be
PostgreSQL (or Postgres-XC/XL/X2/whatever); and you can only use the
facility as part of a sharding solution. The FDW interface can talk
to anything, and it can be used for stuff other than sharding, like
making one remote table appear local because you just happen to want
that for some reason. This makes the XC approach look rather brittle
by comparison. I don't blame the XC folks for taking the shortest
path between two points, but FDWs are better, and we ought to try to
leverage that.

> Particularly, I'm concerned that we already have two projects in process
> aimed at horizontal scalability, and it seems like we could bring either
> (or both) projects to production quality MUCH faster than we could make
> an FDW-based solution work. These are:
>
> * pg_shard
> * BDR
>
> It seems worthwhile, just as a thought experiment, if we can get where
> we want using those, faster, or by combining those with new FDW features.

I think it's abundantly clear that we need a logical replication
solution as part of any horizontal scalability story. People will
want to do things like have 10 machines with each piece of data on 3
of them, and there won't be any reasonable way of doing that without
logical replication. I assume that BDR, or some technology derived
from it, will end up in core and solve that problem. I had actually
hoped we were going to get that in 9.5, but it didn't happen that way.
Still, I think that getting first single-master, and then eventually
multi-master, logical replication in core is absolutely critical. And
not just for sharding specifically: replicating your whole database to
several nodes and load-balancing your clients across them isn't
sharding, but it does give you read scalability and is a good fit for
people with geographically dispersed data with good geographical
locality. I think a lot of people will want that.

I'm not quite sure yet how we can marry declarative partitioning and
better FDW-pushdown and logical replication into one seamless, easy to
deploy solution that requires very low administrator effort. But I am
sure that each of those things, taken individually, is very useful,
and that being able to construct a solution from those building blocks
would be a big improvement over what we have today. I can't imagine
that trying to do one monolithic project that provides all of those
things, but only if you combine them in the specific way that the
designer had in mind, is ever going to be successful. People _will_
want access to each of those features in an unbundled fashion. And,
trying to do them altogether leads to trying to solve too many
problems at once. I think the history of Postgres-XC is a cautionary
tale there.

I don't really understand how pg_shard fits into this equation. It
looks to me like it does some interesting things but, for example, it
doesn't support JOIN pushdown, and suggests that you use the
proprietary CitusDB engine if you need that. But I think JOIN
pushdown is something we want to have in core, not something where we
want to point people to proprietary alternatives. And it has some
restrictions on INSERT statements - they have to contain only values
which are constants or which can be folded to constants. I'm just
guessing, but I bet that's probably due to some limitation which
pg_shard, being out of core, has difficulty overcoming, but we can do
better in core. Basically I guess I expect much of what pg_shard does
to be subsumed as we improve FDWs, but maybe not all of it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2015-08-31 22:18:25 Re: pg_upgrade + Extensions
Previous Message Bruce Momjian 2015-08-31 21:46:08 Re: security labels on databases are bad for dump & restore