Re: A Modest Upgrade Proposal

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: A Modest Upgrade Proposal
Date: 2016-07-14 06:06:31
Message-ID: CAMsr+YGXJJhwyrK=NZyT8aoTdQzQfvUQyFfZdobQyLt6zJVAPQ@mail.gmail.com
Lists: pgsql-hackers

On 14 July 2016 at 03:06, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Physical replication has
> the same issue. Users don't want to configure archive_command and
> wal_keep_segments and max_wal_senders and wal_level and set up an
> archive and create recovery.conf on the standby. They want to spin up
> a new standby - and we don't provide any way to just do that. [...snip...]

> Similarly, when the master fails, users want to promote a
> standby (either one they choose or the one that is determined to be
> furthest ahead) and remaster the others and that's not something you
> can "just do".
>

Oh, I absolutely agree. But that's some pretty epic scope creep, and
weren't you just saying we should cut logical replication to the bone to
get the bare minimum in, letting users deal with keeping table definitions
in sync, etc? We've got a development process where it takes a year to get
even small changes in - mostly for good reasons, but it means it makes
little sense to tie one feature to much bigger ones.

I often feel like with PostgreSQL we give users a box and some assembly
instructions, rather than a complete system. But rather than getting a bike
in a box with a manual, you get the frame in the box, and a really good
manual on how to use the frame, plus some notes to take a look elsewhere to
find wheels, brakes, and a seat, plus an incomplete list of eight different
wheel, brake and seat types. Many of which won't work well together or only
work for some conditions.

But damn, we make a good bike frame, and we document the exact stress
tolerances of the forks!

> Similarly, for logical replication, users will want to do things like
> (1) spin up a new logical replication slave out of thin air,
> replicating an entire database or several databases or selected
> replication sets within selected databases; or (2) subscribe an
> existing database to another server, replicating an entire database or
> several databases; or (3) repoint an existing subscription at a new
> server after a master change or dump/reload, resynchronizing table
> contents if necessary; or (4) stop replication, either with or without
> dropping the local copies of the replicated tables. (This is not an
> exhaustive list, I'm sure.)
>

Yep, and all of that's currently either fiddly or impossible.

To do some of those things even remotely well takes a massive amount more
infrastructure though. Lots of people here will dismiss that, like they
always do for things like connection pooling, by saying "an external tool
can do that". Yeah, it can, but it sucks: you get eight different tools
that each solve 80% of the problem (a different subset each), with erratic
docs and maintenance and plenty of bugs. But OTOH even if we all agreed Pg
should have magic self-healing auto-provisioning auto-discovering
auto-scaling logical replication magic, there's a long path from that to
delivering even the basics. Look at how long 2ndQ people have been working
on just getting the basic low level mechanisms in place. There have been
process issues there too, but most of it comes down to sheer scale and the
difficulty of doing it in a co-ordinated, community friendly way that
produces a robust result.

In addition to host management, you've also got little things like a way to
dump schemas from multiple DBs and unify them in a predictable, consistent
way, then keep them up to date as the schemas on each upstream change,
while blocking changes that can't be merged into the downstream or allowing
the downstream to fail. Since right now our schema copy mechanism is "run
this binary and feed the SQL it produces into the other DB" we're rather a
long way from there!

> I don't mean to imply that the existing designs are bad as far as they
> go. In each case, the functionality that has been built is good. But
> it's all focused, as it seems to me, on providing capabilities rather
> than on providing a way for users to manage a group of database
> servers using high-level primitives.

100% agree.

BDR tried to get part-way there, but has as many problems as solutions, and
to get that far it imposes a lot of restrictions. It's great for one set of
use cases but has to be used carefully and with a solid understanding.

Many of the limitations and restrictions imposed by BDR are because of
limitations in the core server that make a smoother, more transparent
solution infeasible. Like with DDL management, our issues with full table
rewrites, cluster-wide vs database-specific DDL, etc etc etc.

> That higher-level stuff largely
> gets left to add-on tools, which I don't think is serving us
> particularly well.

+1

> Those add-on tools often find that the core
> support doesn't quite do everything they'd like it to do: that's why
> WAL-E and repmgr, for example, end up having to do some creative
> things to deliver certain features. We need to start thinking of
> groups of servers rather than individual servers as the unit of
> deployment.

Yes... but it's a long path there, and we'll need to progressively build
server infrastructure to make that possible.

There's also the issue that most companies who work in the PostgreSQL space
have their own tools and have their own interests to protect. We could
pretend that wasn't the case, but we'd still trip over the elephant we're
refusing to see.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
