Re: The plan for FDW-based sharding

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: The plan for FDW-based sharding
Date: 2016-02-24 02:19:23
Message-ID: 20160224021923.GA12198@momjian.us
Lists: pgsql-hackers

On Wed, Feb 24, 2016 at 01:08:29AM +0000, Simon Riggs wrote:
> On 23 February 2016 at 16:43, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> There was discussion at the FOSDEM/PGDay Developer Meeting
> (https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2016_Developer_Meeting)
> about sharding so I wanted to outline where I think we are going with
> sharding and FDWs.
>
> I think we need to be very careful to understand that "FDWs and Sharding" is
> one tentative proposal amongst others, not a statement of direction for the
--------------

What other directions are proposed to add sharding to the existing
Postgres code? If there are, I have not heard of them. Or are they
only (regularly updated?) forks of Postgres?

> PostgreSQL project since there is not yet any universal agreement.

As I stated clearly, we are going in the FDW direction because improving
FDWs has uses beyond sharding, and once it is done we can see how well
it works for sharding.

> We know Postgres XC/XL works, and scales
>
>
> Agreed. 
>
> In contrast, the FDW/sharding approach is as-yet unproven, and significantly
> without any detailed technical discussion of the exact approach and how it
> would work, even after more than 6 months since we first heard of it openly.
> Since we don't know how it will work, we have no idea how long it will take
> either, or even if it ever will.

Yep.

> I'd like to see discussion of the details in presentation/wiki form and an
> initial prototype, with measurements. Without these things we are still just at
> the speculation stage. Some alternate proposals are also at that stage.

Uh, what "alternate proposals"?

My point was that we know XC/XL works, but there is too much code change
for us, so maybe FDWs will make built-in sharding possible/easier.

> , but we also know they require
> too many code changes to be merged into Postgres (at least based on
> previous discussions).  The FDW sharding approach is to enhance the
> existing features of Postgres to allow as much sharding as possible.
>
> Once that is done, we can see what workloads it covers and
> decide if we are willing to copy the volume of code necessary
> to implement all supported Postgres XC or XL workloads.
> (The Postgres XL license now matches the Postgres license,
> http://www.postgres-xl.org/2015/07/license-change-and-9-5-merge/.
> Postgres XC has always used the Postgres license.)
>
>
> It's never been our policy to try to include major projects in single code
> drops. Any move of XL/XC code into PostgreSQL core would need to be done piece
> by piece across many releases. XL is definitely too big for the elephant to eat
> in one mouthful.

Is there any plan to move the XL/XC code into Postgres? If so, I have
not heard of it. I thought everyone agreed it was too much code change,
which is why it is a separate code tree. Is that incorrect?

> If we are not willing to add code for the missing Postgres XC/XL
> features, Postgres XC/XL will probably remain a separate fork of
> Postgres. 
>
>
> And if the FDW approach doesn't work, that won't be part of PostgreSQL core
> either...

Uh, duh. Yeah, that's what I said. What is your point? I said we
don't know if it will work, as you quoted below:

> I don't think anyone knows the answer to this question, and I
> don't know how to find the answer except to keep going with our current
> FDW sharding approach.
>
>
> This is exactly the wrong time to discuss this, since we are days away from the
> final deadline for PostgreSQL 9.6 and the community should be focusing on that
> for next few months, not futures.

I posted this because of the discussion at the FOSDEM meeting, and to
address the questions you asked in that meeting. I even told you last
week on IM that I was going to post this for that stated purpose. I
didn't pick the time at random.

> What I notice is that when Greenplum announced it would publish as open source
> its modified version of Postgres, there was some scary noise made immediately
> about that concerning patents etc..

> Now, Postgres-XL 9.5 is recently announced and we see another scary sounding
> pronouncement about that *maybe* it won't be included in core. While the
> comments made are true, they do not solely apply to XC/XL, in fact the
> uncertainty applies to all approaches equally since notably we have
> approximately five proposals for future designs.
>
> These comments, given their timing and nature could easily cause "Fear,
> Uncertainty and Doubt" in people seeing this. FUD is also the name of a sales
> technique designed to undermine proposals. I hope and presume it was not the
> intention and reason for discussing uncertainty now and earlier.

Oh, I absolutely did this as a way to undermine what _everyone_ else is
doing? Is there another way to behave?

I find this insulting. Others made the same remarks when I questioned
the patents, and earlier when I questioned if we would integrate the
Greenplum code after their press release. And you know what, we didn't
want the Greenplum code (yet), and I explained how open source code with
patents is riskier than closed-source code with patents, and I think
people finally understood that, including you.

When people don't like what I have to say, they figure there must be
some other motive, because I certainly couldn't think this on my own?
Really? Have I not been around long enough for people to realize that
is not the case?

If you _presume_ I did not have some undermining motivation for posting
this, why did you mention it? You obviously _do_ think I have some
external motivation for talking about FDWs now or you wouldn't have
mentioned it. (I can't even think of what the motivation would be.)

Let me come out and say what people might be thinking: I realize it is
unfortunate that _if_ FDWs succeed in sharding, the value of the work
done on Postgres XC/XL will be diminished. I personally think that
Postgres needs a built-in sharding solution, just like I thought we
needed a native Windows port, in-place upgrade, and parallelism. I was
hopeful XC/XL could be integrated into Postgres, but based on
discussions, it seems that is not acceptable, so the FDW/sharding
approach is the only built-in one I can think of. Are there other
possibilities?

I talk about it and try to get people excited about it. I make no
apologies for that. I will talk about this forever, or as long as
people will listen, so you can expect to hear about it. I am sure I
will think of other "crazy" things to talk about too because the other
items I mentioned above were also considered odd/crazy at the time I
proposed them.

> I'm glad to see that the viability of the XC/XL approach is recognized. The
> fact we have a working solution now is important for users, who don't want to
> wait the 3-5 years while we work out and implement a longer term strategy.
> Future upgrade support is certain, however.

Yes, no question. The benchmarks of XC/XL looked amazing. Can you
remind me of the URLs for that? Do you have any new ones?

In a way, I don't see any need for an FDW sharding prototype because, as
I said, we already know XC/XL work, so copying what they do doesn't
help. What we need to know is if we can get near the XC/XL benchmarks
with an acceptable addition of code, which is what I thought I already
said. Perhaps this can be done with FDWs, or some other approach I have
not heard of yet.

> What eventually gets into PostgreSQL core is as yet uncertain, as is the
> timescale, but my hope is that we recognize that multiple use cases can be
> supported rather than a single fixed architecture. It seems likely to me that
> the PostgreSQL project will do what it does best - take multiple comments and
> merge those into a combined system that is better than any of the individual
> single proposals.

Agreed.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +
