From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Craig Ringer <craig(at)2ndquadrant(dot)com> |
Cc: | Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Kevin Grittner <kgrittn(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Alfred Perlstein <alfred(at)freebsd(dot)org>, Geoff Winkless <pgsqladmin(at)geoff(dot)dj>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Why we lost Uber as a user |
Date: | 2016-08-17 13:35:35 |
Message-ID: | 20160817133535.GA4293@momjian.us |
Lists: | pgsql-hackers |
On Wed, Aug 17, 2016 at 01:27:18PM +0800, Craig Ringer wrote:
> It's really bugging me that people are talking about "statement based"
> replication in MySQL as if it's just sending SQL text around. MySQL's statement
> based replication is a lot smarter than that, and in the
> actually-works-properly form it's a hybrid of row and statement based
> replication ("MIXED" mode). As I understand it, it lobs around something closer
> to parsetrees with some values pre-computed rather than SQL text where
> possible. It stores some computed values of volatile functions in the binlog
> and reads them from there rather than computing them again when running the
> statement on replicas, which is why AUTO_INCREMENT etc works. It also falls
> back to row based replication where necessary for correctness. Even then it has
> a significant list of caveats, but it's pretty damn impressive. I didn't
> realise how clever the hybrid system was until recently.
>
> I can see it being desirable to do something like that eventually as an
> optimisation to logical decoding based replication. Where we can show that the
> statement is safe or make it safe by doing things like evaluating and
> substituting volatile function calls, xlog a modified parsetree with oids
> changed to qualified object names etc, send that when decoding, and execute
> that on the downstream(s). If there's something we can't show to be safe then
> replay the logical rows instead. That's way down the track though; I think it's
> more important to focus on completing logical row-based replication to the
> point where we handle table rewrites seamlessly and it "just works" first.
That was very interesting, and good to know. I assume it also covers the
concurrent-activity issues I wrote about earlier in this thread, e.g.:
> I saw from the Uber article that they weren't going to per-row logical
> replication but _statement_ replication, which is very hard to do
> because typical SQL doesn't record what concurrent transactions
> committed before a new statement's transaction snapshot is taken, and
> doesn't record lock order for row updates blocked by concurrent activity
> --- both of which affect the final result from the query.
I assume they can do SQL-level replication when there is no other
concurrent activity on the table, and row-based in other cases?
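
To make that question concrete, here is a minimal Python sketch of the per-statement choice I am guessing at. It is not MySQL's actual MIXED-mode logic, nor anything that exists in PostgreSQL; `choose_event`, `LogEvent`, `VOLATILE_FUNCS`, and the `concurrent_writers` count are all hypothetical names for illustration, and matching function calls by substring is a deliberate simplification.

```python
from dataclasses import dataclass

# Hypothetical list of calls whose result can differ between primary and replica.
VOLATILE_FUNCS = ("now()", "random()", "nextval()")


@dataclass
class LogEvent:
    kind: str        # "statement" or "rows"
    payload: object  # SQL text for "statement", list of changed rows for "rows"


def choose_event(sql, changed_rows, precomputed, concurrent_writers):
    """Pick the replication format for one statement (illustrative only).

    sql                -- statement text as executed on the primary
    changed_rows       -- the rows the primary actually changed
    precomputed        -- values the primary computed for volatile calls
    concurrent_writers -- assumed count of other transactions writing the table
    """
    # With concurrent writers the outcome depends on snapshot timing and
    # lock order, neither of which the statement text records: ship rows.
    if concurrent_writers > 0:
        return LogEvent("rows", changed_rows)

    # Replace each volatile call with the value the primary computed, so the
    # replica does not re-evaluate it. If a value is missing, fall back to rows.
    for func in VOLATILE_FUNCS:
        if func in sql:
            if func not in precomputed:
                return LogEvent("rows", changed_rows)
            sql = sql.replace(func, repr(precomputed[func]))

    return LogEvent("statement", sql)
```

For example, `choose_event("UPDATE t SET ts = now()", rows, {"now()": "2016-08-17"}, 0)` returns a statement event with the timestamp substituted in, while the same call with one concurrent writer returns a rows event carrying `rows`.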
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +