pgsql: Fix a couple of bugs in MultiXactId freezing

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Fix a couple of bugs in MultiXactId freezing
Date: 2013-11-30 01:06:09
Message-ID: E1VmZ0b-0008Hb-PM@gemulon.postgresql.org
Lists: pgsql-committers pgsql-hackers

Fix a couple of bugs in MultiXactId freezing

Both heap_freeze_tuple() and heap_tuple_needs_freeze() neglected to look
into a multixact to check the members against cutoff_xid. This means
that a very old Xid could survive hidden within a multi, possibly
outliving its CLOG storage. In the distant future, this would cause
clog lookup failures:
    ERROR: could not access status of transaction 3883960912
    DETAIL: Could not open file "pg_clog/0E78": No such file or directory.
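
To show how the transaction number in the error relates to the missing file, here is a minimal sketch of the segment arithmetic. It assumes the default build constants (8 kB blocks, 2 status bits per transaction, 32 pages per segment); the helper names are hypothetical and are not the backend's own functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed defaults: 8 kB blocks, 4 transaction statuses per byte,
 * 32 pages per segment file. */
#define BLOCK_SIZE         8192u
#define XACTS_PER_BYTE     4u
#define XACTS_PER_PAGE     (BLOCK_SIZE * XACTS_PER_BYTE)
#define PAGES_PER_SEGMENT  32u

/* Hypothetical helper: number of the clog segment that stores xid's status. */
static uint32_t
clog_segment_for_xid(uint32_t xid)
{
    return xid / (XACTS_PER_PAGE * PAGES_PER_SEGMENT);
}

/* Hypothetical helper: write the segment's file name into buf. */
static void
clog_segment_name(uint32_t xid, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);
    snprintf(buf, buflen, "pg_clog/%04X",
             (unsigned) clog_segment_for_xid(xid));
}
```

Under these assumptions each segment covers 1048576 transactions, so transaction 3883960912 falls in segment 3704, which is 0E78 in hexadecimal and matches the file named in the DETAIL line.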

This was mostly problematic when the updating transaction aborted, since
in that case the row wouldn't get pruned away earlier in vacuum and the
multixact could survive for a long time. In many cases, data that is
inaccessible for this reason can be brought back heuristically.
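
The missing member check can be sketched as follows. This is an illustration under stated assumptions, not the patched code: the types and function names are hypothetical, and the member list is passed in as a plain array rather than looked up from pg_multixact/.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;               /* stand-in for a 32-bit transaction id */

#define FIRST_NORMAL_XID ((Xid) 3)  /* smaller values are reserved ids */

/* Wraparound-aware "a is older than b" for transaction ids. */
static bool
xid_precedes(Xid a, Xid b)
{
    if (a < FIRST_NORMAL_XID || b < FIRST_NORMAL_XID)
        return a < b;
    return (int32_t) (a - b) < 0;
}

/* Wraparound-aware "a is older than b" for multixact ids. */
static bool
multi_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

/* Hypothetical: true if any member xid is older than cutoff_xid. */
static bool
multi_has_member_before(const Xid *members, size_t nmembers, Xid cutoff_xid)
{
    assert(members != NULL || nmembers == 0);
    for (size_t i = 0; i < nmembers; i++)
    {
        if (xid_precedes(members[i], cutoff_xid))
            return true;
    }
    return false;
}

/*
 * Hypothetical: a multixact xmax needs freezing if the multi itself is
 * older than cutoff_multi, or if any of its members is older than
 * cutoff_xid.  The second test is the one the commit message says was
 * being skipped.
 */
static bool
multi_xmax_needs_freeze(Xid multi, Xid cutoff_multi,
                        const Xid *members, size_t nmembers,
                        Xid cutoff_xid)
{
    if (multi_precedes(multi, cutoff_multi))
        return true;
    return multi_has_member_before(members, nmembers, cutoff_xid);
}
```

With only the first test, a multi newer than cutoff_multi would be left alone even if one of its members were old enough to lose its clog entry.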

As a second bug, heap_freeze_tuple() didn't properly handle multixacts
that need to be frozen according to cutoff_multi, but whose updater xid
is still alive. Instead of preserving the update Xid, it just set Xmax
invalid, which leads to both old and new tuple versions becoming
visible. This is pretty rare in practice, but a real threat
nonetheless. Existing corrupted rows, unfortunately, cannot be repaired
in an automated fashion.
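
The intended handling of the second case can be sketched like this. It is a simplified model, not the patched heap_freeze_tuple(): the structs and names are hypothetical, and "still alive" is approximated here as "not older than cutoff_xid".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;               /* stand-in for a 32-bit transaction id */

#define INVALID_XID      ((Xid) 0)
#define FIRST_NORMAL_XID ((Xid) 3)

/* Hypothetical, simplified view of a tuple's xmax. */
typedef struct
{
    Xid  xmax;           /* plain xid, or a multixact id */
    bool xmax_is_multi;  /* true when xmax names a multixact */
} TupleXmax;

/* Hypothetical summary of a multixact's contents. */
typedef struct
{
    bool has_updater;    /* false when all members are lockers */
    Xid  update_xid;     /* meaningful only if has_updater */
} MultiInfo;

static bool
xid_precedes(Xid a, Xid b)
{
    if (a < FIRST_NORMAL_XID || b < FIRST_NORMAL_XID)
        return a < b;
    return (int32_t) (a - b) < 0;
}

static bool
multi_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

/* Returns true if the tuple's xmax was changed. */
static bool
freeze_multi_xmax(TupleXmax *t, const MultiInfo *info,
                  Xid cutoff_multi, Xid cutoff_xid)
{
    assert(t != NULL && info != NULL);

    if (!t->xmax_is_multi || !multi_precedes(t->xmax, cutoff_multi))
        return false;

    if (info->has_updater && !xid_precedes(info->update_xid, cutoff_xid))
        t->xmax = info->update_xid;  /* keep the update, drop the multi */
    else
        t->xmax = INVALID_XID;       /* only lockers or an old updater */

    t->xmax_is_multi = false;
    return true;
}
```

Setting xmax to the invalid value in the first branch instead would reproduce the bug described above: the old tuple version would no longer record that it was updated, so both versions could become visible.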

Existing physical replicas might have already incorrectly frozen tuples
because of different behavior than in master, which might only become
apparent in the future once pg_multixact/ is truncated; it is
recommended that all clones be rebuilt after upgrading.

This follows code analysis prompted by a bug report from J Smith in message
CADFUPgc5bmtv-yg9znxV-vcfkb+JPRqs7m2OesQXaM_4Z1JpdQ(at)mail(dot)gmail(dot)com
and by a private report from F-Secure.

Backpatch to 9.3, where freezing of MultiXactIds was introduced.

Analysis and patch by Andres Freund, with some tweaks by Álvaro.

Branch
------
REL9_3_STABLE

Details
-------
http://git.postgresql.org/pg/commitdiff/8e53ae025de90b8f7d935ce0eb4d0551178a4caf

Modified Files
--------------
src/backend/access/heap/heapam.c       | 160 ++++++++++++++++++++++++++++----
src/backend/access/transam/multixact.c |  14 ++-
2 files changed, 151 insertions(+), 23 deletions(-)
