Re: Nested transactions: low level stuff

From: Manfred Koizar <mkoi-pg(at)aon(dot)at>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Nested transactions: low level stuff
Date: 2003-03-19 17:24:06
Message-ID: k57h7vcerkt41iftv0ot5gohqhbm6orcng@4ax.com
Lists: pgsql-hackers

On Wed, 19 Mar 2003 11:18:38 -0500, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
wrote:
>Manfred Koizar <mkoi-pg(at)aon(dot)at> writes:
>> If we set XMIN/MAX_IS_COMMITTED in a tuple header, we have to replace
>> a sub-transaction xid in xmin or xmax respectively with the
>> main-transaction xid at the same time. Otherwise we'd have to look
>> for the main xid, whenever a tuple is touched.
>
>This worries me --- it changes a safe operation (OR'ing in a commit bit)
>into an unsafe one that requires exclusive lock on the page containing
>the tuple.

[I'm only talking about xmin here, but everything applies to xmax as well.]
I was hoping we could set xmin atomically without holding a lock. If
we can, we first set xmin to the main xid. The new state is still
consistent; now it looks as if the change has been made directly by
the main transaction and not by one of its subtransactions, which is
ok after the main transaction has committed (we are only talking about
cases where all interesting transactions have committed). As a second
step we update the commit bit which is as safe as it is now.
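
As an illustration only, the two-step order described above can be sketched with C11 atomics, which were not available when this was written. The struct layout, the flag value, and the names TupleSketch, SKETCH_XMIN_COMMITTED, mark_xmin_committed and state_is_consistent are all hypothetical; this is not PostgreSQL's tuple header or its actual code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value, chosen only for this sketch. */
#define SKETCH_XMIN_COMMITTED 0x0100u

/* Hypothetical stand-in for the two tuple header fields discussed. */
typedef struct {
    _Atomic uint32_t xmin;      /* inserting (sub)transaction xid */
    _Atomic uint16_t infomask;  /* hint bits */
} TupleSketch;

/*
 * Step 1: overwrite a subtransaction xid with the main xid.
 * Step 2: OR in the commit bit.
 * Only meant for the case where the main transaction has committed.
 */
static void mark_xmin_committed(TupleSketch *tup, uint32_t main_xid)
{
    assert(tup != NULL);
    atomic_store(&tup->xmin, main_xid);
    atomic_fetch_or(&tup->infomask, SKETCH_XMIN_COMMITTED);
}

/*
 * Invariant a reader relies on: if the commit bit is set, xmin is
 * already the main xid. The bit is read first, then xmin.
 */
static bool state_is_consistent(TupleSketch *tup, uint32_t main_xid)
{
    uint16_t mask = atomic_load(&tup->infomask);
    uint32_t xmin = atomic_load(&tup->xmin);
    return !(mask & SKETCH_XMIN_COMMITTED) || xmin == main_xid;
}
```

A reader that runs between the two steps sees the main xid without the commit bit, which is the same state it would see for a tuple written directly by the main transaction and not yet hinted. Calling mark_xmin_committed twice with the same main xid leaves the tuple unchanged.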

I see no concurrency problems. If two or more backends visit the same
tuple, they either write the same value to the same position which
doesn't hurt, or one sees the other's changes which is a good thing.

So this boils down to whether setting the value of a properly aligned
32-bit variable in shared memory is an atomic operation on all
supported platforms. I don't know enough about compilers to answer
this question.
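
For readers with a C11 compiler, the question can at least be posed in the language itself: <stdatomic.h> lets a program ask whether operations on a given atomic object are lock-free, and gives stores and loads that cannot be torn. The sketch below is an assumption-laden illustration, not a statement about any particular platform; XidSlot and the xid_slot_* helpers are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit xid slot, e.g. a field in a shared page. */
typedef _Atomic uint32_t XidSlot;

/* Ask the implementation whether this slot is handled without a lock. */
static bool xid_slot_is_lock_free(XidSlot *slot)
{
    assert(slot != NULL);
    return atomic_is_lock_free(slot);
}

/* Store the xid as one indivisible write; no ordering is requested. */
static void xid_slot_set(XidSlot *slot, uint32_t xid)
{
    atomic_store_explicit(slot, xid, memory_order_relaxed);
}

/* Load the xid as one indivisible read. */
static uint32_t xid_slot_get(XidSlot *slot)
{
    return atomic_load_explicit(slot, memory_order_relaxed);
}
```

With these helpers a concurrent reader gets either the old xid or the new one, never a mix of bytes from both. Relaxed ordering is used here because the sketch only addresses tearing; ordering the store relative to the commit bit is a separate concern.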

>I'm also concerned that we'd now need a WAL entry to record
>the xid change

If the sequence is "first update xmin, then set the commit bit", we
never have an inconsistent state. And if the change is lost, it can
be redone by the next backend visiting the tuple. So I think we don't
need a WAL entry.
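
The "can be redone" argument is essentially that the update is idempotent. A minimal single-threaded sketch, assuming a hypothetical lookup that maps a committed subtransaction xid to its main xid (demo_main_xid below is a toy stand-in, with 101 and 102 as subtransactions of 100):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value and tuple fields, for this sketch only. */
#define SKETCH_XMIN_COMMITTED 0x0100u

typedef struct {
    uint32_t xmin;
    uint16_t infomask;
} TupleSketch;

/* Hypothetical lookup from any xid to its main transaction's xid. */
typedef uint32_t (*MainXidLookup)(uint32_t xid);

/* Toy mapping: 101 and 102 belong to main transaction 100. */
static uint32_t demo_main_xid(uint32_t xid)
{
    return (xid == 101 || xid == 102) ? 100 : xid;
}

/*
 * What a visiting backend would do once it knows the main
 * transaction has committed: skip if already hinted, otherwise
 * set xmin first and the commit bit second.
 */
static void visit_tuple(TupleSketch *tup, MainXidLookup lookup)
{
    assert(tup != NULL && lookup != NULL);
    if (tup->infomask & SKETCH_XMIN_COMMITTED)
        return;                          /* nothing left to do */
    tup->xmin = lookup(tup->xmin);       /* main xid maps to itself */
    tup->infomask |= SKETCH_XMIN_COMMITTED;
}
```

If the page is lost after the first visit, or only the xmin change reached disk, the next visit starts from the older state and arrives at the same final state, which is why no log record is needed for this change in the sketch.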

> (are we dependent on this change occurring for correctness?
>or is it only performance?)

The latter.

>Perhaps it would be better to leave the tuple-commit bit unset until we
>have been able to change the clog state to 01 ("committed to everyone").

At least we can fall back to this, if we can't find out how to update
the xid safely.

Servus
Manfred
