From: | "Ed L(dot)" <pgsql(at)bluepolka(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: 32/64-bit transaction IDs? |
Date: | 2003-03-22 19:57:01 |
Message-ID: | 200303221257.01205.pgsql@bluepolka.net |
Lists: | pgsql-general |
On Saturday March 22 2003 12:00, Tom Lane wrote:
> "Ed L." <pgsql(at)bluepolka(dot)net> writes:
> > On Saturday March 22 2003 11:29, Tom Lane wrote:
> >> I think this last part is wrong. It shouldn't be using the xid as
> >> part of the ordering, only the sequence value.
> >
> > Why not? How would you replay them on the slave in the same
> > transaction groupings and order that occurred on the master?
>
> Why not is easy:
>
> begin xact 1;
> update tuple X;
> ...
> begin xact 2;
> update tuple Y;
> commit;
> ...
> update tuple Y;
> commit;
>
> (Note that this is only possible in read-committed mode, else xact 1
> wouldn't be allowed to update tuple Y like this.) Here, you must
> replay xact 1's update of Y after xact 2's update of Y, else you'll
> get the wrong final state on the slave. On the other hand, it really
> does not matter whether you replay the tuple X update before or after
> you replay xact 2, because xact 2 didn't touch tuple X.
>
> If the existing DBmirror code is sorting as you say, then it will fail
> in this scenario --- unless it always manages to execute a propagation
> step in between the commits of xacts 2 and 1, which doesn't seem very
> trustworthy.
Well, I'm not absolutely certain, but I think this problem may indeed exist
in dbmirror. If I'm reading it correctly, dbmirror basically has the
following:
create table xact_queue (xid int, seqid serial, ...);
create table tuple_queue (seqid int, data, ...);
The dbmirror trigger does this:
myXid = GetCurrentTransactionId();
insert into xact_queue (xid, seqid) values (myXid, nextval('seqid_seq'));
insert into tuple_queue (seqid, data, ...) values (currval('seqid_seq'), ...);
The slave then grabs all queued xids in order of the max seqid within each
transaction. Essentially,
SELECT xid, MAX(seqid)
FROM xact_queue
GROUP BY xid
ORDER BY MAX(seqid);
In your scenario it would order them xact1, then xact2, since xact 1's
update of Y would have the max seqid. For each xact, it replays the tuples
for that xact in seqid order.
SELECT t.seqid, t.data, ...
FROM tuple_queue t, xact_queue x
WHERE t.seqid = x.seqid
AND x.xid = $XID
ORDER BY t.seqid;
So the actual replay order would be
xact1: update X
xact1: update Y
xact2: update Y
leading to slave inconsistency.
> What I'm envisioning is that you should just send updates in the order
> of their insertion sequence numbers and *not* try to force them into
> transactional grouping. ...
Very good. Makes perfect sense to me now. That also apparently obviates
the need for 64-bit transaction IDs, since the sequence can be a BIGINT.
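To make the seqid-only replay concrete, here is a minimal sketch in Python rather than in dbmirror's own code. The queue rows, the values, and the function name replay_by_seqid are all hypothetical, made up for illustration. It assumes the queue holds only rows from committed transactions and that each row records the new value of one tuple. It replays the three updates from the quoted scenario purely in insertion-sequence order, ignoring the xid.

```python
# Hypothetical queue rows for the quoted scenario: (seqid, xid, tuple_name, new_value).
# seqid is the insertion sequence number assigned by the trigger; values are invented.
queue = [
    (3, 1, "Y", "y-from-xact1"),  # xact 1 updates Y after xact 2 has committed
    (1, 1, "X", "x-from-xact1"),  # xact 1 updates X
    (2, 2, "Y", "y-from-xact2"),  # xact 2 updates Y and commits
]


def replay_by_seqid(rows):
    """Apply queued updates strictly in seqid order, without grouping by xid."""
    state = {}
    for seqid, xid, tuple_name, new_value in sorted(rows, key=lambda r: r[0]):
        state[tuple_name] = new_value  # last writer in seqid order wins
    return state


slave = replay_by_seqid(queue)
print(slave)
```

In this sketch the xid is carried along but never consulted, so the slave ends with xact 1's value for Y, matching the master's final state in the scenario. Whether the replayed updates should be batched into slave-side transactions is left open here.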
Thanks,
Ed