Re: eXtensible Transaction Manager API

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: eXtensible Transaction Manager API
Date: 2015-12-01 14:19:19
Message-ID: 20151201141919.GA1297@momjian.us
Lists: pgsql-hackers

On Tue, Nov 17, 2015 at 12:48:38PM -0500, Robert Haas wrote:
> > At this time, the number of round trips needed particularly for READ
> > COMMITTED transactions that need a new snapshot for each query was
> > really a performance killer. We used DBT-1 (TPC-W) which is less
> > OLTP-like than DBT-2 (TPC-C), still with DBT-1 the scalability limit
> > was quickly reached with 10-20 nodes.
>
> Yeah. I think this merits a good bit of thought. Superficially, at
> least, it seems that every time you need a snapshot - which in the
> case of READ COMMITTED is for every SQL statement - you need a network
> roundtrip to the snapshot server. If multiple backends request a
> snapshot in very quick succession, you might be able to do a sort of
> "group commit" thing where you send a single request to the server and
> they all use the resulting snapshot, but it seems hard to get very far
> with such an optimization. For example, if backend 1 sends a snapshot
> request and backend 2 then realizes that it also needs a snapshot, it
> can't just wait for the reply from backend 1 and use that one. The
> user might have committed a transaction someplace else and then kicked
> off a transaction on backend 2 afterward, expecting it to see the work
> committed earlier. But the snapshot returned to backend 1 might have
> been taken before that. So, all in all, this seems rather crippling.
>
> Things are better if the system has a single coordinator node that is
> also the arbiter of commits and snapshots. Then, it can always take a
> snapshot locally with no network roundtrip, and when it reaches out to
> a shard, it can pass along the snapshot information with the SQL query
> (or query plan) it has to send anyway. But then the single
> coordinator seems like it becomes a bottleneck. As soon as you have
> multiple coordinators, one of them has got to be the arbiter of global
> ordering, and now all of the other coordinators have to talk to it
> constantly.
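
To make the ordering problem concrete, here is a rough sketch of how
that "group commit"-style batching could be done safely (hypothetical
names, nothing to do with the actual XTM hooks): a backend never
reuses a request that was already on the wire when it asked for a
snapshot; instead it waits for the next round, and the next round's
single trip serves every backend that queued up in the meantime.

/*
 * Hypothetical sketch, not the XTM API: batch snapshot requests to a
 * central snapshot server, "group commit" style.  The rule is that a
 * backend may only use a snapshot whose request was sent to the server
 * *after* that backend asked for one; otherwise the snapshot might
 * predate a commit the user already saw elsewhere.
 */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

typedef struct { uint64_t xmin; uint64_t xmax; } Snapshot;  /* stand-in */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  done = PTHREAD_COND_INITIALIZER;
static uint64_t completed_round = 0;    /* rounds whose reply has arrived */
static uint64_t current_round = 0;      /* rounds whose request was sent */
static Snapshot latest;                 /* result of completed_round */
static int      request_in_flight = 0;

/* Stand-in for the network round trip to the snapshot server. */
static Snapshot
ask_snapshot_server(void)
{
    static uint64_t fake_xid = 100;
    Snapshot    s = {fake_xid, fake_xid + 10};

    fake_xid += 10;
    return s;
}

/*
 * Called whenever a backend needs a snapshot.  Backends arriving while
 * a request is already on the wire do NOT piggyback on it; they wait
 * for the next round, which is guaranteed to be taken after their need
 * arose.  Backends arriving before the request is sent share one trip.
 */
static Snapshot
get_shared_snapshot(void)
{
    Snapshot    result;
    uint64_t    my_round;

    pthread_mutex_lock(&lock);
    my_round = current_round + 1;       /* first round not yet sent */

    while (completed_round < my_round)
    {
        if (!request_in_flight)
        {
            /* Become leader of this round and do the single trip. */
            request_in_flight = 1;
            current_round = my_round;
            pthread_mutex_unlock(&lock);

            Snapshot    s = ask_snapshot_server();

            pthread_mutex_lock(&lock);
            latest = s;
            completed_round = my_round;
            request_in_flight = 0;
            pthread_cond_broadcast(&done);
        }
        else
            pthread_cond_wait(&done, &lock);
    }

    result = latest;
    pthread_mutex_unlock(&lock);
    return result;
}

int
main(void)
{
    Snapshot    s = get_shared_snapshot();

    printf("xmin=%llu xmax=%llu\n",
           (unsigned long long) s.xmin, (unsigned long long) s.xmax);
    return 0;
}

The batching win only shows up under concurrency, when backends pile
up behind an in-flight request, which is also exactly the case where
the round trips hurt the most.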

I think the performance benefits of having a single coordinator are
going to require us to implement different snapshot/transaction code
paths for the single-coordinator and multi-coordinator cases. :-( That is,
people who can get by with only a single coordinator are going to want
to do that to avoid the overhead of multiple coordinators, while those
using multiple coordinators are going to have to live with that
overhead.
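
Just to illustrate what I mean by separate code paths (purely a
sketch, these are not the actual XTM hooks), a hook table along these
lines would let the stock local path keep its current cheap snapshot
behavior, while only multi-coordinator installations plug in the
expensive distributed one:

/*
 * Hypothetical illustration only, not the proposed XTM struct.  The
 * point is that both paths sit behind one hook table, and the local
 * path pays no network cost.
 */
#include <stdbool.h>
#include <stdio.h>

typedef struct { unsigned xmin; unsigned xmax; } Snapshot;  /* stand-in */

typedef struct TransactionManager
{
    Snapshot    (*get_snapshot) (void);
    bool        (*commit) (unsigned xid);
} TransactionManager;

/* Local path: no round trips, same behavior as today. */
static Snapshot
local_get_snapshot(void)
{
    Snapshot    s = {100, 110};
    return s;
}

static bool
local_commit(unsigned xid)
{
    (void) xid;
    return true;
}

/* Distributed path: every call is at least one trip to the arbiter. */
static Snapshot
dtm_get_snapshot(void)
{
    /* ... network round trip to the global snapshot arbiter ... */
    Snapshot    s = {200, 240};
    return s;
}

static bool
dtm_commit(unsigned xid)
{
    /* ... global commit protocol, e.g. two-phase commit ... */
    (void) xid;
    return true;
}

static const TransactionManager LocalTM = {local_get_snapshot, local_commit};
static const TransactionManager DistributedTM = {dtm_get_snapshot, dtm_commit};

/* Chosen once at startup; callers never branch on topology themselves. */
static const TransactionManager *TM = &LocalTM;

int
main(int argc, char **argv)
{
    Snapshot    s;

    (void) argv;
    if (argc > 1)               /* pretend a config setting chose the DTM */
        TM = &DistributedTM;

    s = TM->get_snapshot();
    printf("xmin=%u xmax=%u\n", s.xmin, s.xmax);
    return TM->commit(101) ? 0 : 1;
}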

I think an open question is which workloads can use a single
coordinator. Read-only? Long queries? Are those also the cases where
the snapshot/transaction overhead is negligible, meaning we don't need
the single-coordinator optimizations?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +
