Re: cheaper snapshots

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: cheaper snapshots
Date: 2011-07-29 00:14:53
Message-ID: CA+TgmoYhrXsY1hUcKYSxLDzq-NX6ioK5EMBQ13umP+EbQnVM+A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 28, 2011 at 7:54 PM, Ants Aasma <ants(dot)aasma(at)eesti(dot)ee> wrote:
> On Thu, Jul 28, 2011 at 11:54 PM, Kevin Grittner
> <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>> (4)  We communicate acceptable snapshots to the replica to make the
>> order of visibility visibility match the master even when that
>> doesn't match the order that transactions returned from commit.
>
> I wonder if some interpretation of 2 phase commit could make Robert's
> original suggestion implement this.
>
> On the master the commit sequence would look something like:
> 1. Insert commit record to the WAL
> 2. Wait for replication
> 3. Get a commit seq nr and mark XIDs visible
> 4. WAL log the seq nr
> 5. Return success to client
>
> When replaying:
> * When replaying commit record, do everything but make
>  the tx visible.
> * When replaying the commit sequence number
>    if there is a gap between last visible commit and current:
>      insert the commit sequence nr. to list of waiting commits.
>    else:
>      mark current and all directly following waiting tx's visible
>
> This would give consistent visibility order on master and slave. Robert
> is right that this would undesirably increase WAL traffic. Delaying this
> traffic would undesirably increase replay lag between master and slave.
> But it seems to me that this could be an optional WAL level on top of
> hot_standby that would only be enabled if consistent visibility on
> slaves is desired.

I think you nailed it.

An additional point to think about: if we were willing to insist on
streaming replication, we could send the commit sequence numbers via a
side channel rather than writing them to WAL, which would be a lot
cheaper. That might even be a reasonable thing to do, because if
you're doing log shipping, this is all going to be super-not-real-time
anyway. OTOH, I know we don't want to make WAL shipping anything less
than a first class citizen, so maybe not.

At any rate, we may be getting a little sidetracked here from the
original point of the thread, which was how to make snapshot-taking
cheaper. Maybe there's some tie-in to when transactions become
visible, but I think it's pretty weak. The existing system could be
hacked up to avoid making transactions visible out of LSN order, and
the system I proposed could make them visible either in LSN order or
do the same thing we do now. They are basically independent problems,
AFAICS.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-07-29 00:15:06 Re: cheaper snapshots
Previous Message Ants Aasma 2011-07-29 00:12:07 Re: cheaper snapshots