On Aug 22, 2011, at 4:25 PM, Robert Haas wrote:
> What I'm thinking about
> instead is using a ring buffer with three pointers: a start pointer, a
> stop pointer, and a write pointer. When a transaction ends, we
> advance the write pointer, write the XIDs or a whole new snapshot into
> the buffer, and then advance the stop pointer. If we wrote a whole
> new snapshot, we advance the start pointer to the beginning of the
> data we just wrote.
> Someone who wants to take a snapshot must read the data between the
> start and stop pointers, and must then check that the write pointer
> hasn't advanced so far in the meantime that the data they read might
> have been overwritten before they finished reading it. Obviously,
> that's a little risky, since we'll have to do the whole thing over if
> a wraparound occurs, but if the ring buffer is large enough it
> shouldn't happen very often. And a typical snapshot is pretty small
> unless massive numbers of subxids are in use, so it seems like it
> might not be too bad. Of course, it's pretty hard to know for sure
> without coding it up and testing it.
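For reference, the three-pointer scheme quoted above can be modeled roughly as follows. This is only a single-process sketch of the protocol, not PostgreSQL code: the class name `SnapshotRing`, the ever-growing position counters, the `interleave` hook (used to simulate a writer running in the middle of a read), and the `retries` counter are all assumptions made for illustration. A real shared-memory version would also need atomic updates and memory barriers, which this model ignores.

```python
class SnapshotRing:
    """Single-process model of a ring buffer with start, stop and write pointers."""

    def __init__(self, size):
        self.size = size
        self.buf = [0] * size
        # Positions only ever grow; the slot for position p is p % size.
        self.start = 0    # first position of the newest full snapshot
        self.stop = 0     # one past the last fully written position
        self.write = 0    # one past the last reserved position
        self.retries = 0  # how often a reader had to start over

    def append(self, xids, full_snapshot=False):
        xids = list(xids)
        begin = self.write
        new_start = begin if full_snapshot else self.start
        if begin + len(xids) - new_start > self.size:
            # Assumption: the writer falls back to a fresh full snapshot here.
            raise ValueError("does not fit; write a new full snapshot instead")
        self.write = begin + len(xids)           # 1. advance the write pointer
        for i, xid in enumerate(xids):           # 2. copy the data in
            self.buf[(begin + i) % self.size] = xid
        self.stop = self.write                   # 3. advance the stop pointer
        if full_snapshot:
            self.start = begin                   # 4. new snapshot begins here

    def read(self, interleave=None):
        while True:
            start, stop = self.start, self.stop
            data = [self.buf[p % self.size] for p in range(start, stop)]
            if interleave is not None:
                # Simulate a writer running while we were copying (once only).
                hook, interleave = interleave, None
                hook()
            # Position p is overwritten once position p + size is reserved,
            # so the copy is valid only if write has not passed start + size.
            if self.write - start <= self.size:
                return data
            self.retries += 1


ring = SnapshotRing(8)
ring.append([10, 11, 12], full_snapshot=True)   # base snapshot
ring.append([13])                               # one more transaction ended
print(ring.read())
```

The reader never takes a lock; the cost is the retry loop, which only triggers when the writer laps the region being copied.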
Something that would be really nice to fix is our reliance on a fixed amount of shared memory, and I'm wondering if this could be an opportunity to start in a new direction. My thought is that we could maintain two distinct shared memory snapshots and alternate between them. That would allow us to actually resize them as needed. We would still need something like what you suggest to allow for adding to the list without locking, but with this scheme we wouldn't need to worry about extra locking when taking a snapshot, since we'd be doing that in a new segment that no one is using yet.
The downside is that such a scheme adds non-trivial complexity on top of what you proposed. I suspect it would be much better if we had a separate mechanism for dealing with shared memory requirements (shalloc?). But if a generic shared memory manager just isn't practical, it would be good to start thinking about ways we can work around the fixed shared memory size.
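To make the alternating idea a bit more concrete, here is a rough single-process sketch of one way to read it. Everything in it is hypothetical: the class `AlternatingSnapshots`, the plain Python lists standing in for shared memory segments, and the doubling policy on resize. It also sidesteps the hard part, namely how to know that no reader is still looking at the spare segment before it is rewritten or replaced.

```python
class AlternatingSnapshots:
    """Two snapshot areas; writers fill the idle one, then flip to it."""

    def __init__(self, capacity):
        # Two lists stand in for two separately allocated segments.
        self.segments = [[0] * capacity, [0] * capacity]
        self.lengths = [0, 0]
        self.active = 0  # index of the segment readers should use

    def publish(self, xids):
        xids = list(xids)
        spare = 1 - self.active
        if len(xids) > len(self.segments[spare]):
            # Resize only the spare segment; the active one is untouched.
            # Assumption: no reader still holds the spare from an earlier flip.
            self.segments[spare] = [0] * (2 * len(xids))
        self.segments[spare][:len(xids)] = xids
        self.lengths[spare] = len(xids)
        self.active = spare  # flip: new readers now see the new snapshot

    def read(self):
        idx = self.active
        return self.segments[idx][:self.lengths[idx]]


snaps = AlternatingSnapshots(2)
snaps.publish([1, 2])      # fits in the spare segment as allocated
snaps.publish([3, 4, 5])   # too big, so the spare is reallocated first
print(snaps.read())
```

Because only the idle segment is ever resized, the active snapshot stays readable throughout; incremental additions to the active segment would still need something like the ring buffer above.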
Jim C. Nasby, Database Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net