Re: SSI patch version 14

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Jeff Davis" <pgsql(at)j-davis(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SSI patch version 14
Date: 2011-01-25 17:17:06
Message-ID: 4D3EB1320200002500039C53@gw.wicourts.gov
Lists: pgsql-hackers

Jeff Davis <pgsql(at)j-davis(dot)com> wrote:

> For starters, the above structures require unlimited memory, while
> we have fixed amounts of memory. The predicate locks are
> referenced by a global shared hash table as well as per-process
> SHMQueues. It can adapt memory usage downward in three ways:
> * increase lock granularity -- e.g. change N page locks into a
> table lock
> * "summarization" -- fold multiple locks on the same object
> across many old committed transactions into a single lock
> * use the SLRU

Those last two are related -- the summarization process takes the
oldest committed-but-still-significant transaction and does two
things with it:

(1) We move predicate locks to the "dummy" transaction, kept just
for this purpose. We combine locks against the same lock target,
using the more recent commit point to determine when the resulting
lock can be removed.

(2) We use SLRU to keep track of the top level xid of the old
committed transaction, and the earliest commit point of any
transactions to which it had an out-conflict.

The above reduces the information available to avoid serialization
failure in certain corner cases, and is more expensive to access
than the other structures, but it keeps us running in pessimal
circumstances, even if at a degraded level of performance.

> The heavy use of SHMQueue is quite reasonable, but for some reason
> I find the API very difficult to read. I think it would help (at
> least me) quite a lot to annotate (i.e. with a comment in the
> struct) the various lists and links with the types of data being
> held.

We've tried to comment enough, but when you have your head buried in
code you don't always recognize how mysterious something can look.
Can you suggest some particular places where more comments would be
helpful?

> The actual checking of conflicts isn't quite so simple, either,
> because we want to detect problems before the victim transaction
> has done too much work. So, checking is done along the way in
> several places, and it's a little difficult to follow exactly how
> those smaller checks add up to the overall serialization-anomaly
> check (the third point in my simple model).
>
> There are also optimizations around transactions declared READ
> ONLY. Some of these are a little difficult to follow as well, and
> I haven't followed them all.

Yeah. After things were basically working correctly, in terms of
not allowing any anomalies, we still had a lot of false positives.
Checks around the order and timing of commits, as well as read-only
status, helped bring these down. The infamous receipting example
was my main guide here. There are 210 permutations in the way the
statements can be interleaved, of which only six can result in
anomalies. At first we only managed to commit a couple dozen
permutations. As we added code to cover optimizations for various
special cases the false positive rate dropped a little at a time,
until that test hit 204 commits, six rollbacks. That said, all the
tests in the dcheck target are useful -- if I made a mistake in
implementing an optimization there would sometimes be just one or
two of the other tests which would fail. Looking at which
permutations got it right and which didn't was a really good way to
figure out what I did wrong.

> There is READ ONLY DEFERRABLE, which is a nice feature that waits
> for a "safe" snapshot, so that the transaction will never be
> aborted.

*And* will not contribute to causing any other transaction to be
rolled back, *and* dodges the overhead of predicate locking and
conflict checking. Glad you like it! ;-)

> Now, on to my code comments (non-exhaustive):
>
> * I see that you use a union as well as using "outLinks" to also
> be a free list. I suppose you did this to conserve shared memory
> space, right?

Yeah, we tried to conserve shared memory space where we could do so
without hurting performance. Some of it gets to be a little bit-
twiddly, but it seemed like a good idea at the time. Does any of it
seem like it creates a confusion factor which isn't worth it
compared to the shared memory savings?

> * Still some compiler warnings:
> twophase.c: In function 'FinishPreparedTransaction':
> twophase.c:1360: warning: implicit declaration of function
> 'PredicateLockTwoPhaseFinish'

Ouch! That could cause bugs, since the implicit declaration didn't
actually *match* the actual definition. Don't know how we missed
that. I strongly recommend that anyone who wants to test 2PC with
the patch add this commit to it:

http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=8e020d97bc7b8c72731688515b6d895f7e243e27

> * You have a function called CreatePredTran. We are already using
> "Transaction" and "Xact" -- we probably don't need "Tran" as well.

OK. Will rename if you like. Since that creates the PredTran
structure, I assume you want that renamed, too?

> * HS error has a typo:
> "ERROR: cannot set transaction isolationn level to serializable
> during recovery"

Will fix.

> I'll keep working on this patch. I hope I can be of some help
> getting this committed, because I'm looking forward to this
> feature. And I certainly think that you and Dan have applied the
> amount of planning, documentation, and careful implementation
> necessary for a feature like this.

Thanks much! This effort was driven, for my part, by the needs of
my employer; but I have to admit it was kinda fun to do some serious
coding on innovative ideas again. It's been a while. I'm ready to
kick back and party a bit when this gets done, though. ;-)

-Kevin
