| From: | surya poondla <suryapoondla4(at)gmail(dot)com> |
|---|---|
| To: | Tushar Takate <tushar11(dot)takate(at)gmail(dot)com> |
| Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
| Subject: | Re: ERROR: found xmin 4133102167 from before relfrozenxid 4151440783 |
| Date: | 2025-10-28 21:34:53 |
| Message-ID: | CAOVWO5qZzv4Ee_Ar6u0QC0mFPvo1UxguePiOgsadsBp6Lc8pGQ@mail.gmail.com |
| Lists: | pgsql-bugs |
Hi All,
Tushar and I work in the same org.
I was able to look at page 5821149 and gathered the information below:
live_order_us_db=# select * from pg_visibility('public.order', 5821149);
 all_visible | all_frozen | pd_all_visible
-------------+------------+----------------
 t           | t          | f
(1 row)
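Since the visibility map reports the page as all-frozen while offset 5 below is
not actually frozen, an additional cross-check that could be run here (a sketch;
it relies on the same pg_visibility extension used above) is pg_check_frozen(),
which returns the TIDs of tuples that are not frozen on pages whose all-frozen
bit is set, and pg_check_visible() for the all-visible bit:

-- TIDs of non-frozen tuples on pages marked all-frozen in the visibility map
SELECT * FROM pg_check_frozen('public.order');

-- TIDs of non-visible tuples on pages marked all-visible in the visibility map
SELECT * FROM pg_check_visible('public.order');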
live_order_us_db=# SELECT t_ctid, t_xmin, t_xmax, t_infomask,
t_infomask2 FROM heap_page_items(get_raw_page('public.order',
5821149)) WHERE lp = 5;
   t_ctid    |   t_xmin   | t_xmax | t_infomask | t_infomask2
-------------+------------+--------+------------+-------------
 (5821149,5) | 4133102167 |      0 |       2306 |           8
(1 row)
live_order_us_db=# SELECT t_ctid, raw_flags, combined_flags
FROM heap_page_items(get_raw_page('order', 5821149)),
LATERAL heap_tuple_infomask_flags(t_infomask, t_infomask2)
WHERE t_infomask IS NOT NULL OR t_infomask2 IS NOT NULL;
    t_ctid    | raw_flags                                                                                | combined_flags
--------------+------------------------------------------------------------------------------------------+--------------------
 (5821149,1)  | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID}               | {HEAP_XMIN_FROZEN}
 (5821149,2)  | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID}               | {HEAP_XMIN_FROZEN}
 (5821149,3)  | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID}               | {HEAP_XMIN_FROZEN}
 (5821149,4)  | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID,HEAP_UPDATED}  | {HEAP_XMIN_FROZEN}
 (5821149,5)  | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMAX_INVALID}                                 | {}
 (5821149,7)  | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID}               | {HEAP_XMIN_FROZEN}
 (5821149,9)  | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID}               | {HEAP_XMIN_FROZEN}
 (5821149,10) | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID,HEAP_UPDATED}  | {HEAP_XMIN_FROZEN}
 (5821149,11) | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID}               | {HEAP_XMIN_FROZEN}
 (5821149,13) | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID,HEAP_UPDATED}  | {HEAP_XMIN_FROZEN}
 (5821149,14) | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID}               | {HEAP_XMIN_FROZEN}
 (5821149,15) | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID,HEAP_UPDATED}  | {HEAP_XMIN_FROZEN}
 (5821149,16) | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID,HEAP_UPDATED}  | {HEAP_XMIN_FROZEN}
 (5821149,17) | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID,HEAP_UPDATED}  | {HEAP_XMIN_FROZEN}
 (5821149,18) | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID,HEAP_UPDATED}  | {HEAP_XMIN_FROZEN}
 (5821149,19) | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID,HEAP_UPDATED}  | {HEAP_XMIN_FROZEN}
 (5821149,20) | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID,HEAP_UPDATED}  | {HEAP_XMIN_FROZEN}
 (5821149,21) | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID,HEAP_UPDATED}  | {HEAP_XMIN_FROZEN}
 (5821149,22) | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID}               | {HEAP_XMIN_FROZEN}
(19 rows)
As we can see, for t_ctid (5821149,5) the t_infomask decimal value is 2306
(0x902 in hex), which corresponds to the raw flags
{HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMAX_INVALID}. This particular
tuple has somehow missed the freezing action, while the other tuples on the
same page at different offsets (lp 3, 4, 14, 15, ...) are frozen. Also, the
all_visible and all_frozen flags for page 5821149 show as true. We are
wondering what caused the (5821149,5) tuple to escape freezing and, more
importantly, how the relfrozenxid (4151440783) ended up larger than this
tuple's t_xmin (4133102167).
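For reference, a minimal sketch of how the mask decodes and how the current
freeze horizons can be cross-checked (the flag values below are the constants
from src/include/access/htup_details.h; the pg_class lookup assumes the table
is public.order, as above):

-- 2306 = 0x0002 (HEAP_HASVARWIDTH) + 0x0100 (HEAP_XMIN_COMMITTED) + 0x0800 (HEAP_XMAX_INVALID);
-- HEAP_XMIN_INVALID (0x0200) is not set, so the combined HEAP_XMIN_FROZEN (0x0300) pattern is absent.
SELECT to_hex(2306);   -- 902

-- freeze horizon of the relation and of the database
SELECT relname, relfrozenxid, age(relfrozenxid)
  FROM pg_class
 WHERE oid = 'public.order'::regclass;

SELECT datname, datfrozenxid, age(datfrozenxid)
  FROM pg_database
 WHERE datname = current_database();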
- Surya
On Tue, Oct 28, 2025 at 2:17 PM Tushar Takate <tushar11(dot)takate(at)gmail(dot)com>
wrote:
> Hi Team,
>
> I was reviewing a couple of community threads in pgsql-bugs and
> pgsql-general; however, I was unable to determine whether this is a bug or
> actual corruption.
>
> *Details as below *
>
> PostgreSQL version: 15.12
> Disk type: RAID5
> OS: RHEL 8.10
>
>
> *Error/Issue : *
> vacuumdb: error: processing of database "live_order_us_db" failed: ERROR:
> found xmin 4133102167 from before relfrozenxid 4151440783
>
> 2025-09-11 02:29:58.888 UTC,,,2362287,,68c233e1.240hbf,1,,2025-09-11
> 02:28:49 UTC,122/46371006,0,ERROR,XX001,"found xmin 4133102167 from before
> relfrozenxid 4151440783",,,,,"while scanning block 5821149 offset 5 of
> relation ""public.order""
> 2025-09-11 02:40:50.361 UTC,"prod_user_ap","live_order_us_db",2375672,"
> 127.0.0.1:59344",68c2342b.243ff8,4,"VACUUM",2025-09-11 02:30:03
> UTC,169/38875732,0,ERROR,XX001,"found xmin 4133102167 from before
> relfrozenxid 4151440783",,,,,"while scanning block 5821149 offset 5 of
> relation ""public.order""","VACUUM (VERBOSE, ANALYZE)
> public.order;",,,"vacuumdb","client backend",,-5528190995457849841
>
>
> *One more thing/observation we saw in the PostgreSQL logs :*
>
> The following message consistently appeared once a day during the past week
>
> 2025-09-10 23:33:14.469 UTC,,,157915,,68c21a46.268fb,3,,2025-09-10
> 23:31:18 UTC,45/49119328,0,WARNING,01000,"page is not marked all-visible
> but visibility map bit is set in relation ""order"" page
> 5815453",,,,,"while scanning block 5815453 of relation
> ""public.order""",,,,"","autovacuum worker",,0
>
> What specific condition or scenario is triggering this PostgreSQL
> error? Can it be classified as a bug? If not, what’s a safe
> and efficient way to resolve it without relying on a dump and restore,
> particularly for large, mission-critical tables over 200GB?
>
> -
> Thanks & Regards,
>
> Tushar.
> LinkedIn : Tushar <https://www.linkedin.com/in/tushar-t-93660867/>
> My-Blogs : Tushar Blogspot <http://tushar-postgresql.blogspot.in/>
>
>