Re: Exposing the Xact commit order to the user

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Exposing the Xact commit order to the user
Date: 2010-08-30 09:18:53
Message-ID: 1283159933.1800.1844.camel@ebony
Lists: pgsql-hackers

On Sun, 2010-05-23 at 16:21 -0400, Jan Wieck wrote:

> In some systems (data warehousing, replication), the order of commits is
> important, since that is the order in which changes have become visible.
> This information could theoretically be extracted from the WAL, but
> scanning the entire WAL just to extract this tidbit of information would
> be excruciatingly painful.

This idea had support from at least 6 hackers. I'm happy to add my own.

Can I suggest that it be added as a hook, rather than arguing about the
details too much? The main use case is in combination with external
systems, so that way the relevant code can be maintained alongside the
system that cares about it.
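
To make the hook suggestion concrete, here is a minimal sketch in the
style of the backend's existing function-pointer hooks. Every name in it
(xact_commit_order_hook, report_commit, record_commit) is hypothetical
and the TransactionId typedef is only a stand-in; none of this exists in
PostgreSQL as written here.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the backend's transaction id type (assumption). */
typedef uint32_t TransactionId;

/* Hypothetical hook signature: called once per committed transaction. */
typedef void (*xact_commit_order_hook_type) (TransactionId xid,
                                             int64_t commit_timestamp);

/* NULL means no external module has installed itself. */
static xact_commit_order_hook_type xact_commit_order_hook = NULL;

/* Example consumer that an external system might install. */
static TransactionId last_seen_xid = 0;
static int commits_seen = 0;

static void
record_commit(TransactionId xid, int64_t commit_timestamp)
{
    (void) commit_timestamp;    /* unused in this toy consumer */
    last_seen_xid = xid;
    commits_seen++;
}

/* What the commit path would do: a cheap pointer test, then the call. */
static void
report_commit(TransactionId xid, int64_t commit_timestamp)
{
    if (xact_commit_order_hook != NULL)
        xact_commit_order_hook(xid, commit_timestamp);
}
```

An external module would set xact_commit_order_hook = record_commit when
it loads; with nothing installed, the cost in core is one pointer test.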

> CommitTransaction() inside of xact.c will call a function that inserts
> a new record into this array. The operation will for most of the time be
> nothing more than taking a spinlock and adding the record to shared memory.
> All the data for the record is readily available, does not require
> further locking and can be collected locally before taking the spinlock.
> The begin_timestamp is the transaction's idea of CURRENT_TIMESTAMP, the
> commit_timestamp is what CommitTransaction() just decided to write into
> the WAL commit record and the total_rowcount is the sum of inserted,
> updated and deleted heap tuples during the transaction, which should be
> easily available from the statistics collector, unless row stats are
> disabled, in which case the datum would be zero.

Does this need to be called while in a critical section? Or can we wait
until after the actual marking of the commit before calling this?
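
For reference, the "collect locally, then take the spinlock" step in the
quoted proposal might look roughly like the sketch below. It is only an
illustration: the struct layout, the slot count and the wrap-around
behaviour are my assumptions, commit_log_append is a hypothetical name,
and a C11 atomic_flag stands in for the backend's spinlock.

```c
#include <stdatomic.h>
#include <stdint.h>

#define COMMIT_LOG_SLOTS 4      /* tiny on purpose; assumption */

typedef struct CommitOrderRecord
{
    uint32_t    xid;
    int64_t     begin_timestamp;
    int64_t     commit_timestamp;
    int64_t     total_rowcount; /* zero when row stats are disabled */
} CommitOrderRecord;

typedef struct CommitOrderLog
{
    atomic_flag lock;           /* stand-in for a spinlock */
    uint64_t    next_serial;    /* position in commit order */
    CommitOrderRecord slots[COMMIT_LOG_SLOTS];
} CommitOrderLog;

static CommitOrderLog commit_log = { .lock = ATOMIC_FLAG_INIT };

/* Appends one record and returns its serial number in commit order. */
static uint64_t
commit_log_append(CommitOrderLog *log, uint32_t xid, int64_t begin_ts,
                  int64_t commit_ts, int64_t rowcount)
{
    /* Build the record locally, before taking the lock. */
    CommitOrderRecord rec = { xid, begin_ts, commit_ts, rowcount };
    uint64_t    serial;

    while (atomic_flag_test_and_set_explicit(&log->lock,
                                             memory_order_acquire))
        ;                       /* spin until the lock is free */

    serial = log->next_serial++;
    log->slots[serial % COMMIT_LOG_SLOTS] = rec;    /* overwrites oldest */

    atomic_flag_clear_explicit(&log->lock, memory_order_release);
    return serial;
}
```

The lock is held only for the counter increment and one struct copy,
which is why where this is called relative to the critical section
matters more than what it costs.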

> Checkpoint handling will call a function to flush the shared buffers.
> Together with this, the information from WAL records will be sufficient
> to recover this data (except for row counts) during crash recovery.

So it would need to work identically in recovery also?

These two values are not currently stored in the commit WAL record.

timestamptz xci_begin_timestamp
int64 xci_total_rowcount

Both of those seem optional, so I don't really want them added to WAL.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
