From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: cheaper snapshots redux
Date: 2011-09-12 16:01:21
Message-ID: CA+TgmoaQutxtcbMhTOAPXtOUhhjvAQb99uW_4N1C2aDPgLg3ig@mail.gmail.com
Lists: pgsql-hackers
On Mon, Sep 12, 2011 at 11:07 AM, Amit Kapila <amit(dot)kapila(at)huawei(dot)com> wrote:
>> If you know what transactions were running the last time a snapshot summary
>> was written and what transactions have ended since then, you can work out
>> the new xmin on the fly. I have working code for this and it's actually
>> quite simple.
>
> I believe one method to do the same is as follows:
>
> Assume that at some point in time the snapshot and the completed-XID list
> look like this:
>
> Snapshot
>
> { Xmin = 5, Xip[] = 8 10 12, Xmax = 15 }
>
> Committed XIDs: 8, 10, 12, 18, 20, 21
>
> That means 16, 17, 19 are running transactions, so the new snapshot would be:
>
> { Xmin = 16, Xmax = 21, Xip[] = 17, 19 }
Yep, that's pretty much what it does, although xmax is actually
defined as the XID *following* the last one that ended, and I think
xmin needs to also be in xip, so in this case you'd actually end up
with xmin = 15, xmax = 22, xip = { 15, 16, 17, 19 }. But you've got
the basic idea of it.
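To make that arithmetic concrete, here is a minimal C sketch of deriving a fresh snapshot from a stored one plus the list of XIDs that have since ended. This is not PostgreSQL's actual code; the types and the `advance_snapshot` helper are hypothetical simplifications (real XID comparison must handle wraparound, and real xip arrays are not fixed-size):

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_XIDS 64

typedef unsigned int TransactionId;

typedef struct
{
    TransactionId xmin;
    TransactionId xmax;            /* first XID *following* the last one that ended */
    TransactionId xip[MAX_XIDS];   /* in-progress XIDs, ascending */
    int           xcnt;
} Snapshot;

static bool
xid_in(const TransactionId *a, int n, TransactionId x)
{
    for (int i = 0; i < n; i++)
        if (a[i] == x)
            return true;
    return false;
}

/* Derive a new snapshot from a stored one plus the XIDs that ended since. */
static void
advance_snapshot(const Snapshot *old,
                 const TransactionId *ended, int nended,
                 Snapshot *out)
{
    /* New xmax is one past the highest XID known to have ended. */
    TransactionId highest = old->xmax - 1;
    for (int i = 0; i < nended; i++)
        if (ended[i] > highest)
            highest = ended[i];
    out->xmax = highest + 1;

    /* Old in-progress XIDs that still haven't ended survive. */
    out->xcnt = 0;
    for (int i = 0; i < old->xcnt; i++)
        if (!xid_in(ended, nended, old->xip[i]))
            out->xip[out->xcnt++] = old->xip[i];

    /* XIDs assigned after the old snapshot that haven't ended are also
     * in progress; note this picks up the old xmax itself (15 in the
     * example above). */
    for (TransactionId x = old->xmax; x < out->xmax; x++)
        if (!xid_in(ended, nended, x))
            out->xip[out->xcnt++] = x;

    /* xmin is the oldest XID still in progress. */
    out->xmin = (out->xcnt > 0) ? out->xip[0] : out->xmax;
}
```

Running this on the example in the quoted mail ({ 5, 15, {8, 10, 12}, 3 } plus ended XIDs 8, 10, 12, 18, 20, 21) yields xmin = 15, xmax = 22, xip = { 15, 16, 17, 19 }, matching the correction above.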
> But if we calculate Xmin the above way, we need to scan both the existing
> Xip array and the committed-XID array to find it. Won't that take
> noticeable time, even though it is outside the lock, if Xip and the
> committed-XID list are large?
Yes, Tom raised this concern earlier. I can't answer it for sure
without benchmarking, but clearly xip[] can't be allowed to get too
big.
>> Because GetSnapshotData() computes a new value for RecentGlobalXmin by
>> scanning the ProcArray. This isn't costing a whole lot extra right now
>> because the xmin and xid fields are normally in the same cache line, so
>> once you've looked at one of them it doesn't cost that much extra to
>> look at the other. If, on the other hand, you're not looking at (or even
>> locking) the ProcArray, then doing so just to recompute RecentGlobalXmin
>> sucks.
>
> Yes, this takes more time than before, but if our approach to calculating
> Xmin is the one described above, then one extra read outside the lock
> should not matter. If the approach for that point is different, though,
> it will be costlier.
It's not one extra read - you'd have to look at every PGPROC. And it
is not outside a lock, either. You definitely need locking around
computing RecentGlobalXmin; see src/backend/access/transam/README. In
particular, if someone with proc->xmin = InvalidTransactionId is
taking a snapshot while you're computing RecentGlobalXmin, and then
stores a proc->xmin less than your newly-computed RecentGlobalXmin,
you've got a problem. That can't happen right now because no
transactions can commit while RecentGlobalXmin is being computed, but
the point here is precisely to allow those operations to (mostly) run
in parallel.
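The shape of that scan, and the race it implies, can be sketched as follows. This is a hypothetical simplification, not the real GetSnapshotData(): `PGPROC` here is reduced to its xmin field, and the locking that prevents the race is deliberately absent so the comment can point at the window:

```c
#include <assert.h>

typedef unsigned int TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

typedef struct
{
    TransactionId xmin;   /* oldest XID this backend's snapshot can see */
} PGPROC;

/* Compute a candidate RecentGlobalXmin by scanning every PGPROC.
 * Without a lock, a backend we skip because its xmin is still
 * InvalidTransactionId could publish an xmin smaller than our result
 * just after we pass it -- the hazard described above. */
static TransactionId
compute_global_xmin(const PGPROC *procs, int nprocs, TransactionId my_xmin)
{
    TransactionId result = my_xmin;

    for (int i = 0; i < nprocs; i++)
    {
        if (procs[i].xmin != InvalidTransactionId && procs[i].xmin < result)
            result = procs[i].xmin;
    }
    return result;
}
```

Today this is safe because no transaction can commit while the scan holds ProcArrayLock; the point of the proposal is to let snapshot-taking and commit proceed concurrently, which is exactly what makes this naive scan unsound.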
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company