Re: FSM Corruption (was: Could not read block at end of the relation)

From: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-bugs <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: FSM Corruption (was: Could not read block at end of the relation)
Date: 2024-03-07 11:21:24
Message-ID: 1970118.yKVeVyVuyW@aivenlaptop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Le mercredi 6 mars 2024 19:42:28 CET, vous avez écrit :
> On Wed, Mar 06, 2024 at 10:31:17AM +0100, Ronan Dunklau wrote:
> > Le mardi 5 mars 2024, 00:05:03 CET Noah Misch a écrit :
> > > I would guess this one is more risky from a performance perspective,
> > > since
> > > we'd be adding to a hotter path under RelationGetBufferForTuple().
> > > Still,
> > > it's likely fine.
> >
> > I ended up implementing this in the attached patch. The idea is that we
> > detect if the FSM returns a page past the end of the relation, and ignore
> > it. In that case we will fallback through the extension mechanism.
> >
> > For the corrupted-FSM case it is not great performance wise, as we will
> > extend the relation in small steps every time we find a non existing
> > block in the FSM, until the actual relation size matches what is recorded
> > in the FSM. But since those seldom happen, I figured it was better to
> > keep the code really simple for a bugfix.
>
> That could be fine. Before your message, I assumed we would treat this like
> the case of a full page. That is, we'd call
> RecordAndGetPageWithFreeSpace(), which would remove the FSM entry and
> possibly return another. That could be problematic if another backend is
> extending the relation; we don't want to erase the extender's new FSM
> entry. I'm parking this topic for the moment.

I would argue this is fine, as corruptions don't happen often, and FSM is not
an exact thing anyway. If we record a block as full, it will just be unused
until the next vacuum adds it back to the FSM with a correct value.

> > I wanted to test the impact in terms of performance, and I thought about
> > the worst possible case for this.
>
> [test details]
>
> > I noticed no difference running with or without the patch, but maybe
> > someone else can try to run that or find another adversarial case ?
>
> The patch currently adds an lseek() whenever the FSM finds a block. I see
> relevant-looking posts from mailing list searches with subsets of these
> terms: RelationGetNumberOfBlocks lseek benchmark overhead. I bet we should
> reduce this cost add. Some ideas:
>
> - No need for a system call if the FSM-returned value is low enough with
> > respect to any prior RelationGetNumberOfBlocks() call.

I'm not sure what you mean by "low enough". What I chose to implement instead
is to rely on the cached value even when not in recovery for that particular
case. The reasoning being, as above, that if someone extended the relation
since the last cached value, worst case scenario is that we do not insert into
that old block. Whenever we detect a case like this, we just erase the FSM
entry to mark the block as full, so that it's not picked up again next time.

>
> - We'd be happy and efficient with a ReadBufferMode such that
> ReadBufferExtended() returns NULL after a 0-byte read, with all other
> errors handled normally. That sounds like the specification of
> RBM_NORMAL_NO_LOG. Today's implementation of RBM_NORMAL_NO_LOG isn't that,
> but perhaps one RBM could satisfy both sets of needs.

I'm not sure about this one, this seems to have far more reaching
consequences. This would mean we don't have any cooperation from the FSM, and
would need to implement error recovery scenarios in each caller. But I admit I
haven't looked too much into it, and focused instead in seeing if the current
approach can get us to an acceptable state.

>
> On Wed, Mar 06, 2024 at 12:08:38PM +0100, Ronan Dunklau wrote:
> > Subject: [PATCH 2/2] Detect invalid FSM when finding a block.
> >
> > Whenever the FSM points to a block past the end of the relation, detect
> > it and fallback to the relation extension mechanism.
> >
> > This may arise when a FPI for the FSM has been triggered, and we end up
> > WAL-logging a page of the FSM pointing to newly extended blocks in the
> > relation which have never been WAL-logged.
> > ---
> >
> > src/backend/access/heap/hio.c | 20 +++++++++++++++++++-
>
> Each GetPageWithFreeSpace() caller would need this, not just heap. Please
> evaluate whether this logic belongs inside freespace.c vs. in each caller.
> freespace.c comments and/or freespace/README may need updates. Please
> evaluate header comments of freespace.c functions whose callers you are
> changing, and evaluate comments about truncation.

I implemented the new behaviour in fsm_search and
RecordAndGetPageWithFreeSpace.

Regards,

--
Ronan

Attachment Content-Type Size
v2-0001-Detect-FSM-corruption-and-repair-it.patch text/x-patch 6.1 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message David Rowley 2024-03-07 11:32:15 Re: BUG #18381: Invalid query generated by postgres_fdw with UNION ALL and ORDER BY
Previous Message PG Bug reporting form 2024-03-07 10:22:33 BUG #18381: Invalid query generated by postgres_fdw with UNION ALL and ORDER BY