Re: BUG #16285: bt_metap fails with value is out of range for type integer

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Victor Yegorov <vyegorov(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16285: bt_metap fails with value is out of range for type integer
Date: 2020-03-06 01:46:02
Message-ID: CAH2-Wz=H83jZoAjgf85CLSP8teXy-=5gpNX3WRk2Y8gqenGc=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Mon, Mar 2, 2020 at 3:19 PM Victor Yegorov <vyegorov(at)gmail(dot)com> wrote:
> This made me look at the stats, as this is a test copy of the DB upgraded to 12.2 recently. Per pg_stat_user_tables,
> table had never been vacuumed since upgrade, only analyzed.
> I've run vacuum manually and this made the issue go away.
>
> It is beyond my skills to try to find the real place for this issue now. I think we might be hitting some uninitialized value somewhere, perhaps?..

The issue is that bt_metap() was declared using the wrong data types
-- simple as that. An int4 cannot represent 2180413846 -- only a
uint32 (or a TransactionId) can. Technically this is not a recent
issue, since we got btm_root wrong right from the start. However, it
seems more likely that the relatively recently added oldest_xact field
will cause problems in practice. The fact that a manual VACUUM made
that go away for you (at least temporarily) is not surprising.

The declaration itself is wrong, so it is the declaration itself that
we must fix. I can't really see a way of fixing it without introducing
a new version of contrib/pageinspect. :-(

I propose the attached fix for the master branch only. This is a draft
patch. The patch changes the data types to more appropriate, wider
integer types. It follows the convention of using int8/bigint where
the natural data type to use at the C code level is actually uint32 --
pg_stat_statements does this with queryId (actually, we switched over
to 64-bit hashes some time later, but it worked that way before commit
cff440d3686).

The patch takes another idea from pg_stat_statements: Using a
tupledesc's natts field to determine the extension version in use, to
maintain compatibility when there are breaking changes to a function's
definition. I simply error out when bt_metap() is called using the
definition from an older version of the extension, unlike
pg_stat_statements. It isn't worth going to the trouble of preserving
a set of behaviors that are more or less broken anyway. Better to
error out in an obvious way. Especially given that contrib/pageinspect
is fundamentally a superuser-only extension, with a relatively small
user base.

Theoretically, I should also change bt_page_items() (and several
functions) to make its blkno In parameter of type int8/bigint (instead
of int4/integer). I haven't bothered with that, though.

--
Peter Geoghegan

Attachment Content-Type Size
v1-0001-pageinspect-Fix-bt_metap-column-types.patch application/x-patch 4.6 KB

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message David Zhang 2020-03-06 03:53:35 Re: BUG #16147: postgresql 12.1 (from homebrew) - pg_restore -h localhost --jobs=2 crashes
Previous Message Daniel Gustafsson 2020-03-05 12:11:30 Re: Wrong de translation for commands/tablecmds.c:5028