Re: SSI memory mitigation & false positive degradation

From:	"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To:	<pgsql-hackers(at)postgresql(dot)org>
Cc:	<drkp(at)csail(dot)mit(dot)edu>
Subject:	Re: SSI memory mitigation & false positive degradation
Date:	2010-12-27 22:36:42
Message-ID:	4D18C09A0200002500038BEA@gw.wicourts.gov
Lists:	pgsql-hackers

I wrote:

> Dan and I have now implemented most of the mitigation techniques
> ..., and I now feel confident I have a good grasp of how long each
> type of data is useful. (By useful I mean that to maintain data
> integrity without them it will be necessary to roll back some
> transactions which could have been allowed to commit had the data
> been available.)

I think that we need to be able to keep something on the order of 10
times the max_connections number of SERIALIZABLEXACT structures in
shared memory for mitigation techniques to have a chance to work
well for most workloads. When that fills, we will start pushing the
oldest committed transactions out to make room, and fall back on the
graceful degradation. Heikki said in a previous post that he didn't
care if it was 10 times or 100 times so long as it was finite and
there was graceful degradation after that, so it would appear that
unless someone else objects, this should fly. This structure fits
(barely) into 128 bytes when pointers are 64 bits.
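
For a sense of scale, here is a back-of-the-envelope sketch of the
shared memory that sizing implies. The macro and function names are
made up for illustration, and 128 bytes is the approximate figure
quoted above, not a sizeof of the real structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed multiplier of max_connections, per the proposal above. */
#define SXACT_FACTOR 10
/* Approximate per-structure size with 64-bit pointers (from the text). */
#define SXACT_BYTES 128

/* Rough estimate of shared memory used by the SERIALIZABLEXACT pool. */
static size_t
sxact_pool_bytes(size_t max_connections)
{
    /* Guard against overflow of the multiplication below. */
    assert(max_connections <= SIZE_MAX / (SXACT_FACTOR * SXACT_BYTES));
    return max_connections * SXACT_FACTOR * SXACT_BYTES;
}
```

With max_connections = 100 this comes to 128000 bytes, so the fixed
cost is small compared with typical shared_buffers settings.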

> (1) An active read only transaction needs to be able to recognize
> when it is reading a tuple which was written by an overlapping
> transaction which has committed, but only if that read write
> transaction has a rw-conflict out to a transaction committed
> before the read only transaction acquired its snapshot.

> (2) An active read write transaction needs to be able to
> recognize when it is reading a tuple which was written by an
> overlapping transaction which has committed, and to know whether
> that committed transaction had any rw-conflict(s) out to
> previously committed transaction(s).

When committed transactions which have written to permanent tables
need to be pushed from the main structures, I think we need to keep
the xid and the 64-bit commit seq no of the earliest rw-conflict
out. Zero would mean no conflict. Such a list could address these
two needs. We could keep track of min and max xid values on the list
to avoid searches for values out of range. This seems like a
reasonable fit for the SLRU technique suggested by Heikki. Read only
transactions (declared or de facto) don't need to be included in
this list.
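
A minimal in-memory sketch of such a list, to make the zero sentinel
and the min/max range check concrete. All type and function names
here are hypothetical, the fixed-size array stands in for whatever
SLRU-backed storage is actually used, and xid wraparound is ignored.
The last helper assumes need (1) can be answered by comparing the
stored commit seq no with a seq no taken at snapshot time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;           /* stand-in for TransactionId */
typedef uint64_t CommitSeqNo;   /* 0 means "no rw-conflict out" */

#define SUMMARY_CAPACITY 1024   /* hypothetical fixed capacity */

typedef struct
{
    Xid         xid;
    CommitSeqNo earliest_out;   /* earliest rw-conflict out, or 0 */
} SummaryEntry;

typedef struct
{
    SummaryEntry entries[SUMMARY_CAPACITY];
    size_t       count;
    Xid          min_xid;       /* valid only when count > 0 */
    Xid          max_xid;       /* valid only when count > 0 */
} SummaryList;

/* Record a displaced committed read write transaction. */
static bool
summary_add(SummaryList *list, Xid xid, CommitSeqNo earliest_out)
{
    assert(list != NULL);
    if (list->count == SUMMARY_CAPACITY)
        return false;           /* full; caller must handle this */
    if (list->count == 0 || xid < list->min_xid)
        list->min_xid = xid;
    if (list->count == 0 || xid > list->max_xid)
        list->max_xid = xid;
    list->entries[list->count].xid = xid;
    list->entries[list->count].earliest_out = earliest_out;
    list->count++;
    return true;
}

/* Find an xid; on success store its earliest conflict-out seq no. */
static bool
summary_lookup(const SummaryList *list, Xid xid, CommitSeqNo *earliest_out)
{
    if (list->count == 0 || xid < list->min_xid || xid > list->max_xid)
        return false;           /* out of range: skip the search */
    for (size_t i = 0; i < list->count; i++)
    {
        if (list->entries[i].xid == xid)
        {
            *earliest_out = list->entries[i].earliest_out;
            return true;
        }
    }
    return false;
}

/* Need (1): did the writer have a conflict out that committed before
 * the reader's snapshot?  The comparison rule is an assumption. */
static bool
summary_conflict_out_before(const SummaryList *list, Xid writer,
                            CommitSeqNo snapshot_seq_no)
{
    CommitSeqNo out;

    if (!summary_lookup(list, writer, &out))
        return false;
    return out != 0 && out < snapshot_seq_no;
}
```

The linear scan is only for brevity; the range check is the part
that corresponds to the min/max idea above.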

> (3) An active read write transaction needs to be able to detect
> when one of its writes conflicts with a predicate lock from an
> overlapping transaction which has committed. There's no need to
> know which one, but by the definition of a rw-conflict, it must
> have overlapped.

I think the cleanest way to handle this need is to have a "dummy"
SERIALIZABLEXACT structure sitting around to represent displaced
committed transactions and to move predicate locks to that as
transactions are pushed out of the primary structures. We would add
a commit seq no field to the predicate lock structure which would
only be used when locks were moved here. Duplicate locks (locks on
the same target) would collapse to a single lock and would use the
latest commit seq no. This is conceptually very similar to Heikki's
initial suggestion on this topic.
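
A rough sketch of the collapsing rule, with hypothetical names
throughout and a flat array in place of the real predicate lock hash
tables. The conflict check assumes that "overlapped" can be decided
by comparing the stored commit seq no with a seq no taken when the
writer acquired its snapshot; that comparison is an illustration,
not a statement of how the patch does it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t CommitSeqNo;
typedef uint64_t LockTarget;    /* hypothetical opaque lock target id */

#define DUMMY_MAX_LOCKS 256     /* hypothetical capacity */

typedef struct
{
    LockTarget  target;
    CommitSeqNo commit_seq_no;  /* only used for locks moved here */
} MovedLock;

/* Stand-in for the "dummy" SERIALIZABLEXACT's lock list. */
typedef struct
{
    MovedLock locks[DUMMY_MAX_LOCKS];
    size_t    nlocks;
} DummySxact;

/* Move one predicate lock of a displaced transaction to the dummy.
 * Locks on the same target collapse, keeping the latest seq no. */
static bool
dummy_absorb_lock(DummySxact *dummy, LockTarget target,
                  CommitSeqNo commit_seq_no)
{
    assert(dummy != NULL);
    for (size_t i = 0; i < dummy->nlocks; i++)
    {
        if (dummy->locks[i].target == target)
        {
            if (commit_seq_no > dummy->locks[i].commit_seq_no)
                dummy->locks[i].commit_seq_no = commit_seq_no;
            return true;
        }
    }
    if (dummy->nlocks == DUMMY_MAX_LOCKS)
        return false;           /* full; caller must handle this */
    dummy->locks[dummy->nlocks].target = target;
    dummy->locks[dummy->nlocks].commit_seq_no = commit_seq_no;
    dummy->nlocks++;
    return true;
}

/* Need (3): does a write to target conflict with a lock held by some
 * displaced transaction that committed after the writer's snapshot? */
static bool
dummy_write_conflicts(const DummySxact *dummy, LockTarget target,
                      CommitSeqNo writer_snapshot_seq_no)
{
    for (size_t i = 0; i < dummy->nlocks; i++)
    {
        if (dummy->locks[i].target == target)
            return dummy->locks[i].commit_seq_no > writer_snapshot_seq_no;
    }
    return false;
}
```

Keeping only the latest commit seq no is conservative under that
assumption: it can report a conflict for a writer that overlapped
only the later of two collapsed lock holders, but it never misses
one.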

> (4) An active read write transaction needs to know that it had a
> rw-conflict out to a committed transaction. There's no need to
> know which one, but by the definition of a rw-conflict, it must
> have overlapped.
>
> (5) An active read write transaction needs to know that it had a
> rw-conflict in from a committed transaction. There's no need to
> know which one, but by the definition of a rw-conflict, it must
> have overlapped.

These two are easy -- we can define a couple more flag bits for
active transactions to check when determining whether a new
rw-conflict has created a dangerous structure which must be rolled
back.

Any comments welcome. Barring surprises, I start coding on this
tomorrow.

-Kevin

In response to

SSI memory mitigation & false positive degradation at 2010-12-26 19:40:15 from Kevin Grittner
