Re: snapshot too old, configured by time

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <kgrittn(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thom Brown <thom(at)linux(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Steve Singer <steve(at)ssinger(dot)info>, Kevin Grittner <kgrittn(at)ymail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: snapshot too old, configured by time
Date: 2016-04-23 14:20:19
Message-ID: 20160423142019.GF5939@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Apr 23, 2016 at 12:48:08PM +0530, Amit Kapila wrote:
> On Sat, Apr 23, 2016 at 8:34 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> >
> > I kind of agreed with Tom about just aborting transactions that held
> > snapshots for too long, and liked the idea this could be set per
> > session, but the idea that we abort only if a backend actually touches
> > the old data is very nice.  I can see why the patch author worked hard
> > to do that.
> >
> > How does/did Oracle handle this?
> >
>
> IIRC then Oracle gives this error when the space in undo tablespace (aka
> rollback segment) is low.  When the rollback segment gets full, it overwrites
> the changed data which might be required by some old snapshot and when that old
> snapshot statement tries to access the data (which is already overwritten), it
> gets "snapshot too old" error.  Assuming there is enough space in rollback
> segment, Oracle seems to provide a way via Alter System set undo_retention =
> <time_in_secs>. 
>
> Now, if the above understanding of mine is correct, then I think the current
> implementation done by Kevin is closer to what Oracle provides.

But does the rollback only happen if the long-running Oracle transaction
tries to _access_ specific data that was in the undo segment, or _any_
data that potentially could have been in the undo segment? If the
later, it seems Kevin's approach is better because you would have to
actually need to access old data that was there to be canceled, not just
any data that could have been overwritten based on the xid.

Also, it seems we have similar behavior already in applying WAL on the
standby --- we delay WAL replay when there is a long-running
transaction. Once the time expires, we apply the WAL. Do we cancel the
long-running transaction at that time, or wait for the long-running
transaction to touch some WAL we just applied? If the former, does
Kevin's new code allow us to do the later?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2016-04-23 14:41:18 Re: [PATCH] we have added support for box type in SP-GiST index
Previous Message Tom Lane 2016-04-23 14:12:03 Re: Support for N synchronous standby servers - take 2