RE: Is this a problem in GenericXLogFinish()?

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Michael Paquier' <michael(at)paquier(dot)xyz>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Alexander Lakhin <exclusion(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Is this a problem in GenericXLogFinish()?
Date: 2024-02-05 04:29:57
Message-ID: TYCPR01MB12077D8D2D4111BD6DAB21B58F5472@TYCPR01MB12077.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Dear Michael, Amit,

>
> Amit, this has been applied as of 861f86beea1c, and I got pinged about
> the fact this triggers inconsistencies because we always set the LSN
> of the write buffer (wbuf in _hash_freeovflpage) but
> XLogRegisterBuffer() would *not* be called when the two following
> conditions happen:
> - When xlrec.ntups <= 0.
> - When !xlrec.is_prim_bucket_same_wrt && !xlrec.is_prev_bucket_same_wrt
>
> And it seems to me that there is still a bug here: there should be no
> point in setting the LSN on the write buffer if we don't register it
> in WAL at all, no?

Thanks for pointing this out; I agree with you. Please see the attached patch for
diagnosing the issue.

This patch avoids the inconsistency caused by the LSN setting and emits a debug
LOG message when such a case is hit. I ran hash_index.sql and confirmed that the
message was output [1]. This means the current test already has a workload that
meets all of the conditions below:

- the overflow page has no tuples (xlrec.ntups is 0),
- the to-be-written page (wbuf) is not the primary bucket page
(xlrec.is_prim_bucket_same_wrt is false), and
- the to-be-written buffer is not the page preceding the overflow page
(xlrec.is_prev_bucket_same_wrt is false)

So, I think my patch (after removing the elog(...) part) can fix the issue. Thoughts?
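
For illustration only, the condition under discussion can be modelled as a small
standalone sketch. This is not the attached patch and not the code in
_hash_freeovflpage(): the struct SqueezeRecordFlags, the Lsn typedef, and the
helpers wbuf_is_registered() and finish_wbuf() are hypothetical names made up
for this example. Only the three field names come from the discussion above.
The idea is that the LSN of wbuf is touched only when wbuf would have been
registered in WAL.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an LSN; not PostgreSQL's XLogRecPtr. */
typedef uint64_t Lsn;

/* Hypothetical struct holding only the three fields named in this thread. */
typedef struct SqueezeRecordFlags
{
    int  ntups;                     /* tuples moved into the write page */
    bool is_prim_bucket_same_wrt;   /* wbuf is the primary bucket page */
    bool is_prev_bucket_same_wrt;   /* wbuf precedes the freed overflow page */
} SqueezeRecordFlags;

/*
 * Per the quoted description, the write buffer is NOT registered when
 * ntups <= 0 and both flags are false. This returns the negation of that.
 */
static bool
wbuf_is_registered(const SqueezeRecordFlags *xlrec)
{
    return xlrec->ntups > 0 ||
           xlrec->is_prim_bucket_same_wrt ||
           xlrec->is_prev_bucket_same_wrt;
}

/*
 * Model of the intended behaviour: update the page LSN only if the
 * buffer was registered; otherwise leave the LSN untouched.
 */
static void
finish_wbuf(Lsn *page_lsn, Lsn recptr, const SqueezeRecordFlags *xlrec)
{
    if (wbuf_is_registered(xlrec))
        *page_lsn = recptr;
}
```

With all three conditions from the list above in effect, finish_wbuf() leaves
the page LSN unchanged; in every other combination it sets the LSN to recptr.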

[1]:
```
LOG: XXX: is_wbuf_registered: false
CONTEXT: while vacuuming index "hash_cleanup_index" of relation "public.hash_cleanup_heap"
STATEMENT: VACUUM hash_cleanup_heap;
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

Attachment: avoid_registration.patch (application/octet-stream, 1.7 KB)
