Re: Agenda for the Vienna cluster meeting

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Oleg Bartunov <obartunov(at)gmail(dot)com>
Cc: PostgreSQL-cluster development <pgsql-cluster-hackers(at)postgresql(dot)org>
Subject: Re: Agenda for the Vienna cluster meeting
Date: 2015-10-13 17:06:57
Message-ID: 20151013170657.GA7842@momjian.us
Lists: pgsql-cluster-hackers

On Sat, Oct 10, 2015 at 11:32:14PM +0300, Oleg Bartunov wrote:
> What is the goal of this summit, and what is the expected result?
>
> We have the XC/XL/X2, Citus DB, and EDB groups, and I'd certainly be
> interested to know the state of the art of their sharding designs and
> implementations. We'll present our proposal for DTM and its
> implementation, with examples of integration with FDW, pg_shard, and
> probably XL. Our goal is to discuss the API with all groups and
> eventually convince the community to accept it for 9.6. That would make
> development of different approaches easier.

The goal of the meeting is to discuss the possibility of adding built-in
sharding to Postgres. I think this is similar to how we added built-in
replication --- we first implemented external replication solutions, but
once PITR was sufficiently developed, we enhanced it to implement
streaming replication. It took us a few years to get PITR fully
developed, and then a few years to get streaming replication fully
developed --- built-in sharding will probably follow a similar path.

I think with FDWs and parallelism, we are nearing a point where built-in
sharding is a viable approach. It is only viable if the backend changes
and additions are minimal. (There is little community desire to add
tons of new code just to implement sharding.) With FDWs and
parallelism, we can get sharding by enhancing them. We already know we
need a better user partitioning API, so that could benefit sharding too.
A distributed transaction manager is another missing piece, but that
could benefit FDWs too, and other sharding implementations, as you
mentioned.

Streaming replication didn't make external replication solutions
disappear, but for the majority of users built-in replication was the
best approach. I think the same will happen with sharding. I don't
think maintaining a sharding patch set on top of Postgres is a viable
long-term approach, though it has short-term advantages.

Please don't label this as an EDB approach --- I think that is just
divisive. Yes, EDB has customers who want this, and EDB and NTT are
funding some of the development, but many large Postgres users need this
too, and many Postgres service providers have customers who need this.

If you want to label it, call it my approach. I was crazy enough to
lead the Windows port, and crazy enough to get pg_upgrade to production
quality --- hopefully I am crazy enough to get this done too. ;-) This
approach is different from all the ones you listed above because it has
a realistic chance of enabling _built-in_ sharding, and I think built-in
sharding is the only long-term viable mass-adoption solution, just like
streaming replication was for replication.

I have created a wiki agenda that we can all adjust before the meeting:

https://wiki.postgresql.org/wiki/PG-EU_2015_Cluster_Summit

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +
