Re: BUG #15896: pg_upgrade from 10-or-earlier: TRAP: FailedAssertion(»!(metad->btm_version >= 3)«

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Christoph Berg <myon(at)debian(dot)org>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: BUG #15896: pg_upgrade from 10-or-earlier: TRAP: FailedAssertion(»!(metad->btm_version >= 3)«
Date: 2019-07-05 22:14:31
Message-ID: CAH2-Wzmj6pz98qZ6+Ro-=tHvyBJ6q0yxHV8QLOr6O0mE20Nw9Q@mail.gmail.com
Lists: pgsql-bugs

On Fri, Jul 5, 2019 at 8:49 AM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> > TRAP: FailedAssertion("!(metad->btm_version >= 3)", File:
> > "/build/postgresql-12-3URvLF/postgresql-12-12~beta2/build/../src/backend/access/nbtree/nbtpage.c",
> > Line: 665)
>
> Seems that _bt_getrootheight is too optimistic about the metapage
> version it'll find. I suppose this could be handled by just not caching
> the metapage if it is of the old version ... or maybe by calling
> _bt_upgrademetapage().

The problem here predates v12 -- the call to _bt_cachemetadata() was
added to _bt_getrootheight() by commit 0a64b45152b, which went into
v11. My commit dd299df8189 added a new assertion that fails, but
that's just a symptom -- I changed the code in _bt_getrootheight() to
use a BTMetaPageData pointer to shared memory (i.e. a pointer to the
authoritative version), rather than using the newly-out-of-sync cached
version. It shouldn't be out-of-sync at all.

_bt_getrootheight() is mostly just something that exists for the
planner, so it has no business calling _bt_cachemetadata(), which will
"upgrade" the cached metadata image from version 2 to version 3 if it
happens to be on version 2. How can it be okay to upgrade the cached
version without also upgrading the on-disk/shared_buffers version?
This bug was hiding in plain sight.
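
To make the mismatch concrete, here is a rough standalone sketch of what
I mean. It is not the real nbtree code: every Sketch*/sketch_* name is
made up for illustration, and the two-field struct is only a stand-in
for BTMetaPageData. It assumes nothing beyond what's described above: a
caching routine that copies the authoritative metapage and bumps only
the copy's version.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the real nbtree definitions. */
#define SKETCH_OLD_VERSION 2    /* metapage version left behind by pg_upgrade */
#define SKETCH_NEW_VERSION 3    /* version the failing assertion expects */

typedef struct SketchMeta
{
    uint32_t    version;
    uint32_t    root_level;
} SketchMeta;

typedef struct SketchRel
{
    SketchMeta  shared;         /* authoritative image (disk/shared_buffers) */
    SketchMeta  cached;         /* backend-local cached copy */
    bool        cache_valid;
} SketchRel;

/* An index as it would look right after pg_upgrade from an old release. */
static SketchRel
sketch_pg_upgraded_index(void)
{
    SketchRel   rel;

    memset(&rel, 0, sizeof(rel));
    rel.shared.version = SKETCH_OLD_VERSION;
    rel.shared.root_level = 1;
    rel.cache_valid = false;
    return rel;
}

/*
 * Models the problematic caching step: the copy is "upgraded" in place,
 * while the authoritative image is left untouched.
 */
static void
sketch_cache_metadata(SketchRel *rel)
{
    assert(rel != NULL);
    memcpy(&rel->cached, &rel->shared, sizeof(SketchMeta));
    if (rel->cached.version < SKETCH_NEW_VERSION)
        rel->cached.version = SKETCH_NEW_VERSION;   /* cache only */
    rel->cache_valid = true;
}

/* True when there is no cached copy, or it agrees with the shared image. */
static bool
sketch_versions_in_sync(const SketchRel *rel)
{
    return !rel->cache_valid ||
        rel->cached.version == rel->shared.version;
}
```

Calling sketch_cache_metadata() on the result of
sketch_pg_upgraded_index() leaves cached.version at 3 and shared.version
at 2, so any code that then reads the shared image while expecting
version >= 3 trips, which is the shape of the assertion failure
reported here.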

--
Peter Geoghegan
