On Thu, Nov 4, 2010 at 2:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Oct 20, 2010 at 8:11 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>> I'm imagining that the kernel of a
>>>> snapshot is just a WAL position, ie the end of WAL as of the time you
>>>> take the snapshot (easy to get in O(1) time). Visibility tests then
>>>> reduce to "did this transaction commit with a WAL record located before
>>>> the specified position?".
>> I spent a bunch of time thinking about this, and I don't see any way
>> to get the memory usage requirements down to something reasonable.
>> The problem is that RecentGlobalXmin might be arbitrarily far back in
>> XID space, and you'll need to know the LSN of every commit from that
>> point forward; whereas the ProcArray requires only constant space.
> That's like arguing that clog is no good because it doesn't fit in
> constant space. ISTM it would be entirely practical to remember the
> commit LSN positions of every transaction back to RecentGlobalXmin,
> using a data structure similar to pg_subtrans --- in fact, it'd require
> exactly twice as much working space as pg_subtrans, ie 64 bits per XID
> instead of 32. Now, it might be that access contention problems would
> make this unworkable (I think pg_subtrans works largely because we don't
> have to access it often) but it's not something that can be dismissed
> on space grounds.
Maybe I didn't explain that very well. The point is not so much how
much memory you're using in an absolute sense as how much of it you
have to look at to construct a snapshot. If you store a giant array
indexed by XID whose value is an LSN, you have to read a potentially
unbounded number of entries from that array. You can either make a
single read through the relevant portion of the array (snapshot xmin
to snapshot xmax) or you can check each XID as you see it and try to
build up a local cache, but either way there's no fixed limit on how
many bytes must be read from the shared data structure. That compares
unfavorably with the current design, where you do a one-time read of a
bounded amount of data and you're done. I suspect your theory about
pg_subtrans is correct.
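To make the tradeoff concrete, here is a minimal sketch of what a visibility test against the "giant XID array" would look like. All names here (commit_lsn, xid_visible, MAX_XID) are hypothetical illustrations, not actual PostgreSQL code; the point is that each test is a single array read, but building or caching a snapshot still means touching every slot between xmin and xmax.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint64_t XLogRecPtr;        /* commit LSN; 0 = not yet committed */

/* Hypothetical shared array indexed by XID, 64 bits per entry --
 * twice the per-XID footprint of pg_subtrans, as Tom notes above. */
#define MAX_XID 1024
static XLogRecPtr commit_lsn[MAX_XID];

/* Visibility reduces to: did this XID commit with a WAL record
 * located at or before the snapshot's end-of-WAL position? */
static int
xid_visible(TransactionId xid, XLogRecPtr snapshot_lsn)
{
    XLogRecPtr lsn = commit_lsn[xid];   /* one shared read per test */
    return lsn != 0 && lsn <= snapshot_lsn;
}
```

The per-test cost is O(1), but there is no bounded-size snapshot to copy out once and reuse: the relevant slice of the array is (xmax - xmin) entries wide, which is exactly the unbounded read the paragraph above objects to.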
> [ thinks for a bit... ] But actually this probably ends up being a
> wash or a loss as far as contention goes. We're talking about a data
> structure that has to be updated during each commit, and read pretty
> frequently, and it's not obvious how that's any better than getting
> commit info from the ProcArray. Although neither commit nor reading
> would require a *global* lock, so maybe there's a way ...
Well, if we're talking about the "giant XID array" design, or some
variant of that, I would expect that nearly all of the contention
would be on the last page or two, so I don't think it would be much
better than a global lock. You might be able to get around that by
not using an LWLock, and instead using something like lock xchg or
LL/SC to atomically update entries, but I'm not sure there's a
portable way to do such operations on anything larger than a 4-byte
word. At any rate, I think the problem described in the preceding
paragraph is the more serious one.
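For what it's worth, here is a sketch of the lockless-update idea using C11 atomics. This is an anachronism relative to the discussion (C11 postdates this thread, and whether 8-byte atomics are lock-free is still platform-dependent), and the names are invented for illustration; the observation it encodes is that each XID's slot is written exactly once, by the committing backend, so a release-store paired with an acquire-load could in principle replace an LWLock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* One slot of the hypothetical XID -> commit-LSN array. */
static _Atomic uint64_t slot;

/* Commit path: publish the commit LSN with a single atomic store.
 * No lock is needed because the slot transitions 0 -> LSN exactly once. */
static void
publish_commit_lsn(uint64_t lsn)
{
    atomic_store_explicit(&slot, lsn, memory_order_release);
}

/* Visibility-test path: an acquire load pairs with the release store,
 * so a reader that sees the LSN also sees the commit it describes. */
static uint64_t
read_commit_lsn(void)
{
    return atomic_load_explicit(&slot, memory_order_acquire);
}
```

Note that atomic_is_lock_free() on a 64-bit object can still return false on some targets, which is the portability worry raised above in its modern form.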
The Enterprise PostgreSQL Company