From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | chenhj <chjischj(at)163(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: could not access status of transaction |
Date: | 2020-01-06 18:39:35 |
Message-ID: | CA+TgmoYc0cmQKd+ogi=BwRqwnQ21ooSEj0O84wKenxAiPzZT+Q@mail.gmail.com |
Lists: | pgsql-hackers |
On Sun, Jan 5, 2020 at 11:00 PM chenhj <chjischj(at)163(dot)com> wrote:
> According to the above information, the flags of the heap page (163363) containing the problem tuple (163363, 9) are 0x0001 (HAS_FREE_LINES); that is, ALL_VISIBLE is not set.
>
> However, according to the hexdump of the corresponding vm file, that block (located at 9F88 + 6bit) has both the VISIBILITYMAP_ALL_FROZEN and VISIBILITYMAP_ALL_VISIBLE flags set. That is, the heap file and the vm file are inconsistent.
That's not supposed to happen, and represents data corruption. Your
previous report of a too-old xmin surviving in the heap is also
corruption. There is no guarantee that both problems have the same
cause, but suppose they do. One possibility is that a write to the
heap page may have gotten lost or undone. Suppose that, while this
page was in shared_buffers, VACUUM came through and froze it, setting
the bits in the VM and later truncating CLOG. Then, suppose that when
that page was evicted from shared_buffers, it didn't really get
written back to disk, or alternatively it did, but then later somehow
the old version reappeared. I think that would produce these symptoms.

I think that bad hardware could cause this, or running two copies of
the server on the same data files at the same time, or maybe some kind
of filesystem-related flakiness, especially if, for example, you are
using a network filesystem like NFS, or maybe a broken iSCSI stack.
There is also no reason it couldn't be a bug in PostgreSQL itself,
although if we lost page writes routinely somebody would surely have
noticed by now.
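
The lost-write scenario above can be sketched as a toy model. This is
only an illustration, not PostgreSQL code: the ToyCluster class, its
methods, and the XID values are all hypothetical, and XIDs are compared
as plain integers rather than with wraparound-aware logic. The flag
values are the ones I believe bufpage.h and visibilitymapdefs.h define.

```python
from dataclasses import dataclass, replace
from typing import List, Optional

# Flag values believed to match PostgreSQL's bufpage.h / visibilitymapdefs.h.
PD_HAS_FREE_LINES = 0x0001
PD_ALL_VISIBLE = 0x0004
VISIBILITYMAP_ALL_VISIBLE = 0x01
VISIBILITYMAP_ALL_FROZEN = 0x02


@dataclass(frozen=True)
class HeapPage:
    """Toy heap page: one tuple's xmin, a frozen marker, and page flags."""
    xmin: int
    xmin_frozen: bool
    flags: int


class ToyCluster:
    """Hypothetical model: one heap page, its VM bits, and a CLOG horizon."""

    def __init__(self, xmin: int) -> None:
        self.disk = HeapPage(xmin, False, PD_HAS_FREE_LINES)
        self.buffer: Optional[HeapPage] = self.disk  # copy in "shared_buffers"
        self.vm_bits = 0
        self.oldest_clog_xid = 0  # CLOG older than this has been truncated

    def vacuum_freeze(self, new_oldest_xid: int) -> None:
        # VACUUM freezes the tuple in the buffered page and sets the VM bits;
        # CLOG is later truncated up to the new horizon.
        assert self.buffer is not None
        self.buffer = replace(
            self.buffer,
            xmin_frozen=True,
            flags=self.buffer.flags | PD_ALL_VISIBLE,
        )
        self.vm_bits = VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN
        self.oldest_clog_xid = new_oldest_xid

    def evict(self, write_lost: bool) -> None:
        # On eviction the page should reach disk; a lost write leaves the
        # old on-disk image in place.
        if not write_lost:
            assert self.buffer is not None
            self.disk = self.buffer
        self.buffer = None

    def symptoms(self) -> List[str]:
        page = self.disk  # what a later read from disk would see
        found = []
        if (self.vm_bits & VISIBILITYMAP_ALL_VISIBLE) and not (
            page.flags & PD_ALL_VISIBLE
        ):
            found.append("VM says all-visible but heap page lacks PD_ALL_VISIBLE")
        if not page.xmin_frozen and page.xmin < self.oldest_clog_xid:
            found.append("unfrozen xmin older than truncated CLOG")
        return found


def run(write_lost: bool) -> List[str]:
    cluster = ToyCluster(xmin=1000)
    cluster.vacuum_freeze(new_oldest_xid=5000)
    cluster.evict(write_lost=write_lost)
    return cluster.symptoms()


if __name__ == "__main__":
    print("write succeeded:", run(write_lost=False))
    print("write lost:     ", run(write_lost=True))
```

With the write intact the model reports nothing; with the write lost it
reports both symptoms at once, which is why a single lost or reverted
page write is a plausible common cause for the two reports.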
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company