Re: Freezing without write I/O

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Freezing without write I/O
Date: 2013-05-30 18:39:46
Message-ID: CA+TgmoZ6YEYfXQRi=YM5WWJ5raG9PKQpzDcim+3YJhFzyo3yrw@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 30, 2013 at 9:33 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> The reason we have to freeze is that otherwise our 32-bit XIDs wrap around
> and become ambiguous. The obvious solution is to extend XIDs to 64 bits, but
> that would waste a lot of space. The trick is to add a field to the page header
> indicating the 'epoch' of the XID, while keeping the XIDs in tuple header
> 32-bit wide (*).

Check.

> The other reason we freeze is to truncate the clog. But with 64-bit XIDs, we
> wouldn't actually need to change old XIDs on disk to FrozenXid. Instead, we
> could implicitly treat anything older than relfrozenxid as frozen.

Check.

> That's the basic idea. Vacuum freeze only needs to remove dead tuples, but
> doesn't need to dirty pages that contain no dead tuples.

Check.

> Since we're not storing 64-bit wide XIDs on every tuple, we'd still need to
> replace the XIDs with FrozenXid whenever the difference between the smallest
> and largest XID on a page exceeds 2^31. But that would only happen when
> you're updating the page, in which case the page is dirtied anyway, so it
> wouldn't cause any extra I/O.

It would cause some extra WAL activity, but it wouldn't dirty the page
an extra time.
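
To make the condition in the quoted paragraph concrete, here is a minimal
sketch in C. It assumes the XIDs on a page have already been widened to 64
bits; the function name and the array-based interface are hypothetical and
not taken from the PostgreSQL source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when the spread between the smallest
 * and largest 64-bit XID on a page exceeds 2^31, i.e. when the oldest
 * XIDs could no longer be told apart from the newest ones if stored as
 * 32-bit values and would have to be replaced with FrozenXid.
 */
static bool
page_xid_span_too_wide(const uint64_t *xids, size_t n)
{
    if (n == 0)
        return false;

    uint64_t min = xids[0];
    uint64_t max = xids[0];

    for (size_t i = 1; i < n; i++)
    {
        if (xids[i] < min)
            min = xids[i];
        if (xids[i] > max)
            max = xids[i];
    }

    return (max - min) > (UINT64_C(1) << 31);
}
```

Such a check would only need to run when the page is being modified anyway,
which is the point made above about not dirtying the page an extra time.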

> This would also be the first step in allowing the clog to grow larger than 2
> billion transactions, eliminating the need for anti-wraparound freezing
> altogether. You'd still want to truncate the clog eventually, but it would
> be nice to not be pressed against the wall with "run vacuum freeze now, or
> the system will shut down".

Interesting. That seems like a major advantage.

> (*) "Adding an epoch" is inaccurate, but I like to use that as my mental
> model. If you just add a 32-bit epoch field, then you cannot have xids from
> different epochs on the page, which would be a problem. In reality, you
> would store one 64-bit XID value in the page header, and use that as the
> "reference point" for all the 32-bit XIDs on the tuples. See existing
> convert_txid() function for how that works. Another method is to store the
> 32-bit xid values in tuple headers as offsets from the per-page 64-bit
> value, but then you'd always need to have the 64-bit value at hand when
> interpreting the XIDs, even if they're all recent.
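
The "reference point" idea in the footnote can be sketched roughly as
follows. This is an illustration only, not the existing PostgreSQL code:
widen_xid and FIRST_NORMAL_XID are hypothetical names, and the sketch assumes
every normal XID on the page lies within 2^31 of the page's 64-bit reference
value.

```c
#include <stdint.h>

/* Assumption: XIDs below 3 are special values and are returned as-is. */
#define FIRST_NORMAL_XID 3u

/*
 * Hypothetical helper: widen a 32-bit tuple XID to 64 bits by picking the
 * 64-bit value closest to the page's reference XID whose low 32 bits
 * match the tuple XID.
 */
static uint64_t
widen_xid(uint64_t ref, uint32_t xid)
{
    if (xid < FIRST_NORMAL_XID)
        return xid;

    /*
     * Modular distance from the reference, reinterpreted as signed.
     * This relies on two's-complement conversion, which holds on the
     * platforms this sketch targets.
     */
    int32_t delta = (int32_t) (xid - (uint32_t) ref);

    return ref + (int64_t) delta;
}
```

With this scheme, tuples whose XIDs are slightly older or slightly newer
than the reference can share a page even when they straddle an epoch
boundary, which a plain 32-bit epoch field would not allow.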

As I see it, the main downsides of this approach are:

(1) It breaks binary compatibility (unless you do something to
provide for it, like putting the epoch in the special space).

(2) It consumes 8 bytes per page. I think it would be possible to get
this down to say 5 bytes per page pretty easily; we'd simply decide
that the low-order 3 bytes of the reference XID must always be 0.
Possibly you could even do with 4 bytes, or 4 bytes plus some number
of extra bits.
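
As a rough illustration of the 5-byte idea in point (2), here is a minimal
sketch in C. The function names, the byte order, and the 5-byte array layout
are assumptions made for this example; nothing here reflects an actual page
format.

```c
#include <stdint.h>

/* Assumption: the low-order 3 bytes (24 bits) of the reference are zero. */
#define REF_XID_LOW_BITS 24

/* Hypothetical: store the upper 40 bits of the reference XID in 5 bytes. */
static void
pack_ref_xid(uint64_t ref, uint8_t out[5])
{
    uint64_t hi = ref >> REF_XID_LOW_BITS;

    for (int i = 0; i < 5; i++)
        out[i] = (uint8_t) (hi >> (8 * i));
}

/* Hypothetical: rebuild the 64-bit reference; its low 24 bits are zero. */
static uint64_t
unpack_ref_xid(const uint8_t in[5])
{
    uint64_t hi = 0;

    for (int i = 0; i < 5; i++)
        hi |= (uint64_t) in[i] << (8 * i);

    return hi << REF_XID_LOW_BITS;
}
```

Packing a reference whose low 24 bits are not already zero rounds it down to
a multiple of 2^24, so the reference can only be placed at that granularity.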

(3) You still need to periodically scan the entire relation, or else
have a freeze map as Simon and Josh suggested.

The upsides of this approach as compared with what Andres and I are
proposing are:

(1) It provides a stepping stone towards allowing indefinite expansion
of CLOG, which is quite appealing as an alternative to a hard
shut-down.

(2) It doesn't place any particular requirements on PD_ALL_VISIBLE. I
don't personally find this much of a benefit as I want to keep
PD_ALL_VISIBLE, but I know Jeff and perhaps others disagree.

Random thought: Could you compute the reference XID based on the page
LSN? That would eliminate the storage overhead.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
