Re: The plan for FDW-based sharding

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: The plan for FDW-based sharding
Date: 2016-02-24 01:08:29
Message-ID: CANP8+j+4Nwx-qdDAhvD5cYiLwFwHeBi1sn_2Aa3rC9C1Anjp0A@mail.gmail.com
Lists: pgsql-hackers

On 23 February 2016 at 16:43, Bruce Momjian <bruce(at)momjian(dot)us> wrote:

> There was discussion at the FOSDEM/PGDay Developer Meeting
> (https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2016_Developer_Meeting)
> about sharding so I wanted to outline where I think we are going with
> sharding and FDWs.
>

I think we need to be very careful to understand that "FDWs and Sharding"
is one tentative proposal amongst others, not a statement of direction for
the PostgreSQL project, since there is not yet any universal agreement.

> We know Postgres XC/XL works, and scales

Agreed.

In contrast, the FDW/sharding approach is as yet unproven and,
significantly, lacks any detailed technical discussion of the exact
approach and how it would work, even more than 6 months after we
first heard of it openly. Since we don't know how it will work, we have no
idea how long it will take either, or even whether it ever will work.

I'd like to see discussion of the details in presentation/wiki form and an
initial prototype, with measurements. Without these things we are still
just at the speculation stage. Some alternate proposals are also at that
stage.

> , but we also know they require
> too many code changes to be merged into Postgres (at least based on
> previous discussions). The FDW sharding approach is to enhance the
> existing features of Postgres to allow as much sharding as possible.
>
> Once that is done, we can see what workloads it covers and
> decide if we are willing to copy the volume of code necessary
> to implement all supported Postgres XC or XL workloads.
> (The Postgres XL license now matches the Postgres license,
> http://www.postgres-xl.org/2015/07/license-change-and-9-5-merge/.
> Postgres XC has always used the Postgres license.)
>

It's never been our policy to try to include major projects in single code
drops. Any move of XL/XC code into PostgreSQL core would need to be done
piece by piece across many releases. XL is definitely too big for the
elephant to eat in one mouthful.

> If we are not willing to add code for the missing Postgres XC/XL
> features, Postgres XC/XL will probably remain a separate fork of
> Postgres.

And if the FDW approach doesn't work, that won't be part of PostgreSQL core
either...

> I don't think anyone knows the answer to this question, and I
> don't know how to find the answer except to keep going with our current
> FDW sharding approach.
>

This is exactly the wrong time to discuss this, since we are days away from
the final deadline for PostgreSQL 9.6 and the community should be focusing
on that for the next few months, not futures.

What I notice is that when Greenplum announced it would publish its
modified version of Postgres as open source, some scary noise was made
immediately about patents etc.

Now, Postgres-XL 9.5 has recently been announced and we see another
scary-sounding pronouncement that *maybe* it won't be included in core.
While the comments made are true, they do not apply solely to XC/XL; in
fact the uncertainty applies to all approaches equally, since notably we
have approximately five proposals for future designs.

These comments, given their timing and nature, could easily cause "Fear,
Uncertainty and Doubt" in people seeing this. FUD is also the name of a
sales technique designed to undermine proposals. I hope and presume it was
not the intention and reason for discussing uncertainty now and earlier.

I'm glad to see that the viability of the XC/XL approach is recognized. The
fact we have a working solution now is important for users, who don't want
to wait the 3-5 years while we work out and implement a longer term
strategy. Future upgrade support is certain, however.

What eventually gets into PostgreSQL core is as yet uncertain, as is the
timescale, but my hope is that we recognize that multiple use cases can be
supported rather than a single fixed architecture. It seems likely to me
that the PostgreSQL project will do what it does best - take multiple
comments and merge those into a combined system that is better than any of
the individual single proposals.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
