Re: Proposal: Snapshot cloning

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Hannu Krosing" <hannu(at)skype(dot)net>, "Jan Wieck" <JanWieck(at)Yahoo(dot)com>, "PostgreSQL Development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Snapshot cloning
Date: 2007-01-26 17:22:34
Message-ID: 1169832155.3772.413.camel@silverbirch.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 2007-01-26 at 11:36 -0500, Tom Lane wrote:
> "Simon Riggs" <simon(at)2ndquadrant(dot)com> writes:
> > No, that would break MVCC. But we may have done lots of updates/deletes
> > that are *not* visible to any Snapshot, yet are not yet removable
> > because they are higher than OldestXmin but we don't know that because
> > previously the Snapshot details were not available. ISTM that this
> > proposal is a way of making the Snapshot limits publicly available so
> > that they can be used by VACUUM.
>
> Certainly not, unless you intend that *every* snapshot *must* be
> published, which is an overhead up with which we will not put.

Agreed, but that's the general case problem.

What I was hoping was that this would provide a mechanism for long
running transactions (LRTs) to publish their min/max Xids. Then if all
backends publish the minimum Xid of any Snapshot they have generated in
the proc array, we'd be able to decide if there are any large holes in
the global set of Snapshots. As a general case that's hard to evaluate,
but in the common case of a lone LRT and all the rest short duration
transactions you can end up with a gap of 250,000+ transactions opening
up between the two. It would be fairly easy to have VACUUM check for
large "visibility gaps" between groups of transactions and then use that
to improve its effectiveness in the presence of LRTs.

Theoretically we have to keep the chain of intermediate updates around
so it can be traversed by the old transaction, but in practical terms
traversing a long chain of updates isn't sensible. Serializable LRTs
will never traverse the chain anyway (that's a serializability error),
but there are some special cases to consider, hence my mentioning an
unresolved problem previously.

We'd need to be much more careful about the way Snapshots are managed,
so we can be certain that we take them all into account.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Gregory Stark 2007-01-26 17:29:23 Re: Proposal: Snapshot cloning
Previous Message Gregory Stark 2007-01-26 17:10:01 Recursive query syntax ambiguity