Re: I'd like to discuss scaleout at PGCon

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: MauMau <maumau307(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: I'd like to discuss scaleout at PGCon
Date: 2018-06-06 03:06:08
Message-ID: 20180606030608.GI1442@paquier.xyz
Lists: pgsql-hackers

On Wed, Jun 06, 2018 at 01:14:04AM +0900, MauMau wrote:
> I don't think an intermediate server like the coordinators in XL is
> necessary. That extra hop can be eliminated by putting both the
> coordinator and the data node roles in the same server process. That
> is, the node to which an application connects communicates with other
> nodes only when it does not have the necessary data.

Yes, I agree with that. This was actually a concern I had about the
original XC design after a couple of years working on it. The fewer
nodes, the easier the HA, even if applying any PITR logic across N nodes
instead of N*2 nodes, with the 2PC checks and cleanups, is far from
trivial either. It happens that the code resulting from splitting the
coordinator and datanode roles was simpler to maintain than a merged
version would have been, at the cost of operational maintenance and
complexity in running the whole thing.

> Furthermore, an extra hop and double parsing/planning could matter for
> analytic queries, too. For example, SAP HANA boasts of scanning 1
> billion rows in one second. In HANA's scaleout architecture, an
> application can connect to any worker node and the node communicates
> with other nodes only when necessary (there's one special node called
> "master", but it manages the catalog and transactions; it's not an
> extra hop like the coordinator in XL). Vertica is an MPP analytics
> database, but it doesn't have a node like the coordinator, either. To
> achieve maximum performance for real-time queries, the scaleout
> architecture should avoid an extra hop when possible.

Greenplum's ORCA planner (and Citus?) has such facilities if I recall
correctly; just mentioning that pushing compiled, execution-ready plans
down to remote nodes exists here and there (that's not the case for
XC/XL). For queries whose planning time is way shorter than their actual
execution, like analytical work, that would not matter much, but it does
for OLTP and short-transaction workloads.
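
As a rough illustration of the difference (hypothetical code again, not
Greenplum, Citus or XC/XL), the point is only where parsing and planning
happen: shipping raw SQL makes every remote node repeat that work, while
shipping an already-compiled plan leaves only the execution on the
remote side.

#include <stdio.h>

typedef struct Plan
{
    const char *compiled;       /* stands in for a serialized plan tree */
} Plan;

static Plan
parse_and_plan(const char *sql)
{
    printf("  parse+plan: %s\n", sql);
    return (Plan) { sql };
}

static void
execute(const Plan *p)
{
    printf("  execute:    %s\n", p->compiled);
}

/* Statement shipping: the remote node parses and plans all over again. */
static void
remote_run_sql(const char *sql)
{
    Plan p = parse_and_plan(sql);

    execute(&p);
}

/* Plan shipping: the remote node only executes what it was handed. */
static void
remote_run_plan(const Plan *p)
{
    execute(p);
}

int
main(void)
{
    const char *sql = "SELECT count(*) FROM t";
    Plan        p = parse_and_plan(sql);    /* planned once on entry node */

    printf("statement shipping:\n");
    remote_run_sql(sql);                    /* extra parse+plan per node */
    printf("plan shipping:\n");
    remote_run_plan(&p);                    /* compiled plan pushed down */
    return 0;
}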

>> Using a central coordinator also allows multi-node transaction
>> control, global deadlock detection, etc.
>
> VoltDB does not have an always-pass hop like the coordinator in XL.

Greenplum also uses a single-coordinator, multi-datanode setup. That
looks similar, right?

> Our proprietary RDBMS named Symfoware, which is not based on
> PostgreSQL, also doesn't have an extra hop, and can handle distributed
> transactions and deadlock detection/resolution without any special
> node like GTM.

Interesting to know that. This is an area with difficult problems. The
closer you stay merged with Postgres head, the more fun (?) you get
trying to support new SQL features, and sometimes you finish with hard
ERRORs or extra GUC switches to prevent any kind of inconsistent
operation.
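
For reference, the kind of GUC switch I mean would look like the sketch
below. The module and setting names are invented, but
DefineCustomBoolVariable() is the actual backend API an extension would
use to register such a switch.

#include "postgres.h"
#include "fmgr.h"
#include "utils/guc.h"

PG_MODULE_MAGIC;

/* Invented switch: refuse operations that cannot be kept consistent
 * across nodes unless the administrator explicitly opts in. */
static bool allow_unsafe_distributed_ops = false;

void _PG_init(void);

void
_PG_init(void)
{
    DefineCustomBoolVariable("myext.allow_unsafe_distributed_ops",
                             "Allows operations that may be inconsistent across nodes.",
                             NULL,
                             &allow_unsafe_distributed_ops,
                             false,       /* off by default */
                             PGC_SUSET,   /* superuser-settable */
                             0,
                             NULL, NULL, NULL);
}

/*
 * Call sites for unsupported features would then do something like:
 *
 *     if (!allow_unsafe_distributed_ops)
 *         ereport(ERROR,
 *                 (errmsg("operation is not supported across nodes")));
 */
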
--
Michael
