Re: [HACKERS] [PATCH] Provide 8-byte transaction IDs to

From: Hannu Krosing <hannu(at)skype(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Marko Kreen <markokr(at)gmail(dot)com>, pgsql-patches(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] [PATCH] Provide 8-byte transaction IDs to
Date: 2006-07-26 21:16:33
Message-ID: 1153948593.2928.28.camel@localhost.localdomain
Lists: pgsql-hackers pgsql-patches

One fine day, Wed, 2006-07-26 at 13:35, Bruce Momjian wrote:
> I am sure you worked hard on this, but I don't see the use case,

The use case is any Slony-like replication system or queueing system
which needs a consistent means of identifying batches of transactions
which have finished during some period.

You can think of this as a core component for building slony that does
*not* break at 2G trx.
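
To illustrate the batching idea, here is a minimal sketch, not code from the
patch. The Snapshot class, visible_in() and batch_between() are hypothetical
names, and the visibility rule is an assumption modelled on the usual
xmin/xmax/active-list snapshot semantics. With 8-byte txids the plain integer
comparisons below stay valid across 32-bit wraparound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for the patch's 'snapshot' type."""
    xmin: int               # every txid below this had finished
    xmax: int               # every txid at or above this had not started
    active: frozenset       # txids in [xmin, xmax) still running


def visible_in(snap: Snapshot, txid: int) -> bool:
    """Assumed rule: a txid is visible if it finished before the snapshot."""
    if txid < snap.xmin:
        return True
    if txid >= snap.xmax:
        return False
    return txid not in snap.active


def batch_between(prev: Snapshot, cur: Snapshot, event_txids):
    """Events whose transaction finished after prev but before cur."""
    return [t for t in event_txids
            if not visible_in(prev, t) and visible_in(cur, t)]


prev = Snapshot(xmin=100, xmax=105, active=frozenset({100, 102}))
cur = Snapshot(xmin=102, xmax=110, active=frozenset({102, 107}))
batch = batch_between(prev, cur, [99, 100, 101, 102, 103, 105, 107, 109, 110])
```

Here batch holds 100, 105 and 109: the transactions that were unfinished at
the first snapshot and finished by the second.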

> nor
> have I heard people in the community requesting such functionality.

You will, once more Slony users reach the 2-billion-transaction limit and
start silently losing data, only to find out a few weeks later.

> Perhaps pgfoundry would be a better place for this.

At least the part that manages the epoch should be in core.

The rest can actually be on pgfoundry as a separate project, or inside
skytools/pgQ.

> ---------------------------------------------------------------------------
>
> Marko Kreen wrote:
> >
> > Intro
> > -----
> >
> > The following patch exports 8-byte txids and snapshots to user level,
> > allowing their use in regular SQL. It is based on the Slony-I xxid
> > module. It provides a special 'snapshot' type for snapshots but
> > uses regular int8 for transaction IDs.
> >
> > Exported API
> > ------------
> >
> > Type: snapshot
> >
> > Functions:
> >
> > current_txid() returns int8
> > current_snapshot() returns snapshot
> > snapshot_xmin(snapshot) returns int8
> > snapshot_xmax(snapshot) returns int8
> > snapshot_active_list(snapshot) returns setof int8
> > snapshot_contains(snapshot, int8) returns bool
> > pg_sync_txid(int8) returns int8
> >
> > Operation
> > ---------
> >
> > Extension to 8 bytes is done by keeping track of the wraparound count
> > in pg_control. On every checkpoint, nextxid is compared to the one
> > stored in pg_control. If the value is smaller, a wraparound happened
> > and the epoch is increased.
> >
> > When long txid or snapshot is requested, pg_control is locked with
> > LW_SHARED for retrieving epoch value from it. The patch does not
> > affect core functionality in any other way.
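
The checkpoint-time epoch tracking described above could be sketched roughly
as follows. This is a simplified illustration, not the patch's code:
EpochTracker and its methods are hypothetical names, the 32-bit xid width is
assumed, and xids that straddle a wraparound between two checkpoints are
ignored for brevity.

```python
XID_BITS = 32  # assumed width of the native transaction ID


class EpochTracker:
    """Hypothetical model of the epoch and nextxid copy kept in pg_control."""

    def __init__(self, epoch: int = 0, last_xid: int = 0):
        self.epoch = epoch
        self.last_xid = last_xid

    def on_checkpoint(self, next_xid: int) -> None:
        # A smaller nextxid than the stored one means the counter wrapped.
        if next_xid < self.last_xid:
            self.epoch += 1
        self.last_xid = next_xid

    def long_txid(self, xid: int) -> int:
        # Epoch goes in the high bits, the 32-bit xid in the low bits.
        return (self.epoch << XID_BITS) | xid


tracker = EpochTracker()
tracker.on_checkpoint(4_000_000_000)
before_wrap = tracker.long_txid(4_000_000_000)
tracker.on_checkpoint(1000)          # counter wrapped since last checkpoint
after_wrap = tracker.long_txid(1000)
```

After the second checkpoint the epoch is 1, so after_wrap is larger than
before_wrap even though the raw 32-bit xid is smaller.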
> >
> > Backup/restore of txid data
> > ---------------------------
> >
> > Currently I made pg_dumpall output following statement:
> >
> > "SELECT pg_sync_txid(%d)", current_txid()
> >
> > then on the target database, pg_sync_txid checks whether its current
> > (epoch + GetTopTransactionId()) is larger than the given argument.
> > If not, it bumps the epoch until it is, thus guaranteeing that
> > newly issued txids are larger than in the source database. If restored
> > into the same database instance, nothing will happen.
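
As a rough sketch of that epoch-bumping rule (an assumption about the logic,
not the patch's implementation): sync_epoch() is a hypothetical helper, and
combining epoch and xid as high and low 32 bits is assumed.

```python
def sync_epoch(epoch: int, current_xid: int, source_txid: int) -> int:
    """Return the epoch to use so that the local 8-byte txid exceeds
    source_txid. Hypothetical helper; the epoch is bumped one step at
    a time, as the description suggests."""
    while ((epoch << 32) | current_xid) <= source_txid:
        epoch += 1
    return epoch


source = (3 << 32) | 100                    # txid dumped from the source database
same_epoch = sync_epoch(0, 500, source)     # local xid already past 100
next_epoch = sync_epoch(0, 50, source)      # local xid behind 100
unchanged = sync_epoch(5, 10, 100)          # target already far ahead
```

In the third call the target is already ahead, so the epoch is left alone,
matching the "nothing will happen" case.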
> >
> >
> > Advantages of 8-byte txids
> > --------------------------
> >
> > * Indexes won't break silently. No need for mandatory periodic
> > truncate which may not happen for various reasons.
> > * Allows keeping values from different databases in one table/index.
> > * Ability to move data to a different server and continue there.
> >
> > Advantages in being in core
> > ---------------------------
> >
> > * Core code can guarantee that wraparound check happens in 2G transactions.
> > * Core code can update pg_control non-transactionally. Module
> > needs to operate inside user transaction when updating epoch
> > row, which brings various problems (READ COMMITTED vs. SERIALIZABLE,
> > long transactions, locking, etc).
> > * Core code has only one place where it needs to update, module
> > needs to have epoch table in each database.
> >
> > Todo, tothink
> > -------------
> >
> > * Flesh out the documentation. Probably needs some background.
> > * Better names for some functions?
> > * pg_sync_txid allows use of pg_dump for moving a database,
> > but also adds the possibility to shoot yourself in the foot by allowing
> > epoch wraparound to happen. Is "Don't do it then" enough?
> > * Currently txid keeps its own copy of nextxid in pg_control;
> > this makes data dependencies clear. It's possible to drop it
> > and use ->checkPointCopy->nextXid directly, thus saving 4 bytes.
> > * Should pg_sync_txid() be issued by pg_dump instead of pg_dumpall?
> >
> > --
> > marko
> >
>
> [ Attachment, skipping... ]
>
--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com
