Re: Serializable Isolation without blocking

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Simon Riggs" <simon(at)2ndQuadrant(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Serializable Isolation without blocking
Date: 2009-05-07 15:56:48
Message-ID: 4A02BE70.EE98.0025.0@wicourts.gov
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:

> It wouldn't be 692 lines of code

Agreed. The original implementation was in an MVCC database which
already supported full serializability using strict two-phase locking
and used page-level locks. Both of these made the implementation
simpler than it would be in PostgreSQL. (And that's not even
mentioning sub-transactions and distributed transactions!)

> and even if it were the impact of that
> code would be such that it would need to be optional

I was thinking perhaps a GUC to allow "traditional" behavior when
SERIALIZABLE is requested versus using snapshot isolation for
REPEATABLE READ and this new technique for SERIALIZABLE. Would that
be sane?

> If the use is optional, I would currently prefer the existing
> mechanism for implementing serialization, which is to serialize
> access directly using either a LOCK statement or an exclusive
> advisory lock.

I'm sure many will, particularly where the number of tables is less
than 100 and the number of queries which can be run concurrently is
only a thousand or two. Picking out the potential conflicts and
hand-coding serialization techniques becomes more feasible on a small
scale like that.
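
As a rough illustration of the hand-coded approach being discussed, here
is a minimal sketch of serializing access with an exclusive advisory lock
from a Python client. It assumes a psycopg2-style DB-API connection that
takes `%s` parameters; the `advisory_lock` helper and the lock key are
hypothetical names chosen for the example. Only `pg_advisory_lock` and
`pg_advisory_unlock` are actual PostgreSQL functions.

```python
from contextlib import contextmanager


@contextmanager
def advisory_lock(conn, key):
    """Run the enclosed work while holding a session-level advisory lock.

    `conn` is assumed to be a DB-API connection (psycopg2-style);
    `key` is an application-chosen integer identifying the resource.
    """
    cur = conn.cursor()
    # Blocks until no other session holds the same key.
    cur.execute("SELECT pg_advisory_lock(%s)", (key,))
    try:
        yield cur
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        # Session-level advisory locks survive commit and rollback,
        # so the lock must be released explicitly.
        cur.execute("SELECT pg_advisory_unlock(%s)", (key,))
        conn.commit()
```

Every code path that touches the protected data has to take the same
key, which is exactly the part that gets hard to audit as the number of
tables and queries grows.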

That said, there's a lot less room for mistakes here, once this new
technique is implemented and settled in. When I was discussing the
receipting and deposit scenario while trying to clarify the
documentation of current behavior, I received several suggestions from
respected members of this community for how that could be handled with
existing techniques, most of which didn't, in fact, correct the
problem. That just points out to me how tricky it is to solve on an
ad hoc basis, as
opposed to a more rigorous technique like the one described in the
paper.

The only suggested fix which *did* work forced actual serialization of
all receipts as well as actual serialization of those with the deposit
report query. The beauty of this new technique is that there would
not be any blocking in the described scenario, and there would be a
rollback with serialization failure if (and only if) there was an
attempt to run the deposit report query while a transaction for a
receipt on the old date was still pending. I suspect that the
concurrency improvements of the new technique over existing safe
techniques would allow it to scale well, at least in our environment.
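
From the application side, a rollback with serialization failure is
normally handled by retrying the whole transaction. The following is a
minimal sketch of that pattern, not part of the proposal itself. It
assumes the new technique would report the failure with SQLSTATE 40001
(serialization_failure), as existing serialization failures do, and that
the driver exposes the SQLSTATE as a `pgcode` attribute on the exception
(psycopg2 does). `run_with_retry` and its parameters are hypothetical
names.

```python
import time

SERIALIZATION_FAILURE = "40001"  # SQLSTATE for serialization_failure


def run_with_retry(conn, work, max_attempts=5, backoff=0.0):
    """Run work(conn) in a transaction, retrying on serialization failure.

    `work` is assumed to issue its statements in a SERIALIZABLE
    transaction on `conn` and to be safe to re-run from the start.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            result = work(conn)
            conn.commit()
            return result
        except Exception as exc:
            conn.rollback()
            retryable = getattr(exc, "pgcode", None) == SERIALIZATION_FAILURE
            if not retryable or attempt == max_attempts:
                raise
            # Optional pause before retrying; zero by default.
            time.sleep(backoff * attempt)
```

In the scenario described, only the transaction that actually hit the
conflict pays this cost; receipts that never overlap with the deposit
report are neither blocked nor retried.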

> It's clear that any new-theory solution will cost significantly more
> as the number of users increases, at least O(N^2), whereas simply
> waiting is only O(N), AFAICS.

I'm not following your reasoning on the O(N^2). Could you explain why
you think it would follow that curve?

> So it seems its use would require some thought and care and possibly
> further research to uncover areas of applicability in real usage.

Care -- of course. Real usage for serializable transactions -- well
known already. (Or are you just questioning performance here?)

> So for me, I would say we leave this be until the SQL Standard
> changes to recognise the additional mode.

It already recognizes this mode; it doesn't yet recognize snapshot
isolation (more's the pity).

> I don't see much advantage for us in breaking the ground on this
> feature and it will be costly to implement, so is a good PhD
> project.

Apparently it's already been done as a PhD project -- by Michael
Cahill, against InnoDB.

-Kevin
