Re: SSI patch version 14

From: Dan Ports <drkp(at)csail(dot)mit(dot)edu>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: SSI patch version 14
Date: 2011-01-29 04:36:40
Message-ID: 20110129043640.GN57071@csail.mit.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 27, 2011 at 09:18:23AM -0800, Jeff Davis wrote:
> On Tue, 2011-01-25 at 05:57 -0500, Dan Ports wrote:
> > This summary is right on. I would add one additional detail or
> > clarification to the last point, which is that rather than checking for
> > a cycle, we're checking for a transaction with both "in" and "out"
> > conflicts, which every cycle must contain.
>
> To clarify, this means that it will get some false positives, right?

Yes, this is correct.

> Is there a reason we can't check for a real cycle, which would let T1
> succeed?

I talked today with someone who experimented with doing exactly that in
MySQL as part of a research project (Perfectly Serializable Snapshot
Isolation, paper forthcoming in ICDE)

It confirmed my intuition that this is possible but not as
straightforward as it sounds. Some complications I thought of in
adapting that to what we're doing:

1) it requires tracking all edges in the serialization graph; besides
the rw-conflicts we track there are also the more mundane
rw-dependencies (if T2 sees an update made by T1, then T2 has to
come after T1) and ww-dependencies (if T2's write modifies a tuple
created by T1, then T2 has to come after T1).

We are taking advantage of the fact that any cycle must have two
adjacent rw-conflict edges, but the rest of the cycle can be
composed of other types. It would certainly be possible to track the
others, but it would add a bit more complexity and use more memory.

2) it requires doing a depth-first search to check for a cycle, which
is more expensive than just detecting two edges. That seems OK if
you only want to check it on commit, but maybe not if you want to
detect serialization failures at the time of the conflicting
read/write (as we do).

3) it doesn't seem to interact very well with our memory mitigation
efforts, wherein we discard lots of data about committed
transactions that we know we won't need. If we were checking for
cycles, we'd need to keep more of it. I suspect summarization (when
we combine two transactions' predicate locks to save memory) would
also cause problems.

None of these seem particularly insurmountable, but they suggest it
isn't a clear win to try to find an entire cycle.

Dan

--
Dan R. K. Ports MIT CSAIL http://drkp.net/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2011-01-29 07:10:19 Re: Include WAL in base backup
Previous Message Bruce Momjian 2011-01-29 04:07:00 Re: pg_upgrade fails for non-postgres user