Re: FSM Corruption (was: Could not read block at end of the relation)

From: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
To: Noah Misch <noah(at)leadboat(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>
Cc: pgsql-bugs <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: FSM Corruption (was: Could not read block at end of the relation)
Date: 2024-03-06 09:31:17
Message-ID: 1958756.PYKUYFuaPT@aivenlaptop
Lists: pgsql-bugs

On Tuesday, 5 March 2024, 00:05:03 CET, Noah Misch wrote:
> I would guess this one is more risky from a performance perspective, since
> we'd be adding to a hotter path under RelationGetBufferForTuple(). Still,
> it's likely fine.

I ended up implementing this in the attached patch. The idea is that we detect
when the FSM returns a page past the end of the relation, and ignore it.
In that case we fall back to the extension mechanism.

For the corrupted-FSM case this is not great performance-wise, as we will extend
the relation in small steps every time we find a nonexistent block in the FSM,
until the actual relation size matches what is recorded in the FSM. But since
such corruption seldom happens, I figured it was better to keep the code really
simple for a bugfix.
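
To make the behaviour described above concrete, here is a minimal toy model in
C. It is not the code from the attached patch: ToyRelation, toy_fsm_search,
toy_extend, toy_get_target_block and toy_extensions_until_fsm_valid are all
hypothetical names invented for this sketch, and the model ignores page
contents and locking entirely. It only shows the shape of the check: a block
suggested by the FSM is used only if it lies below the actual relation size,
otherwise we extend by one block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Hypothetical stand-in for a relation and what its FSM advertises. */
typedef struct ToyRelation
{
    BlockNumber nblocks;        /* blocks that really exist on disk */
    BlockNumber fsm_candidate;  /* block the (possibly stale) FSM returns */
} ToyRelation;

/* Ask the toy FSM for a block; a corrupted FSM may point past the end. */
static BlockNumber
toy_fsm_search(const ToyRelation *rel)
{
    return rel->fsm_candidate;
}

/* Extend the relation by exactly one block and return the new block number. */
static BlockNumber
toy_extend(ToyRelation *rel)
{
    assert(rel->nblocks < InvalidBlockNumber - 1);
    return rel->nblocks++;
}

/*
 * Pick a target block: trust the FSM only if the block exists, otherwise
 * ignore its answer and fall back to extending the relation.
 */
static BlockNumber
toy_get_target_block(ToyRelation *rel, bool *extended)
{
    BlockNumber blk = toy_fsm_search(rel);

    if (blk != InvalidBlockNumber && blk < rel->nblocks)
    {
        *extended = false;
        return blk;
    }
    *extended = true;
    return toy_extend(rel);
}

/* Count how many one-block extensions happen before the FSM answer is valid. */
static unsigned
toy_extensions_until_fsm_valid(ToyRelation *rel)
{
    unsigned    count = 0;
    bool        extended = true;

    while (extended)
    {
        (void) toy_get_target_block(rel, &extended);
        if (extended)
            count++;
    }
    return count;
}
```

In this model, a relation with 10 real blocks whose FSM keeps advertising
block 14 is extended one block at a time until block 14 exists, which is the
"small steps" cost mentioned above.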

I wanted to measure the performance impact, so I thought about the worst
possible case for this: prepare a table as described further down, then run a
pgbench doing insertions into it. With the attached patch, the worst sequence I
could come up with for each insertion is:
- remember which page we last inserted into
- notice we don't have enough space
- ask the FSM for a block
- now have to compare that to the actual relation size

So I came up with the following initialization steps:

- create a table with vacuum_truncate = off, with a tuple size big enough that
it's impossible to fit two tuples on the same page
- insert lots of tuples in it until it reaches a decent size
- delete them all
- vacuum
- all of this fitting in shared_buffers

As in:

CREATE TABLE test_perf (c1 char(5000));
ALTER TABLE test_perf ALTER c1 SET STORAGE PLAIN;
ALTER TABLE test_perf SET (VACUUM_TRUNCATE = off);
INSERT INTO test_perf (c1) SELECT 'c' FROM generate_series(1, 1000000);
DELETE FROM test_perf;
VACUUM test_perf;

Then I ran pgbench with a single client, with a script only inserting the same
value over and over again, for 1000000 transactions (matching the initial table
size in rows).

I noticed no difference running with or without the patch, but maybe someone
else can try to run that or find another adversarial case?

Best regards,

--
Ronan Dunklau

Attachment Content-Type Size
0001-Detect-invalid-FSM-when-finding-a-block.patch text/x-patch 2.3 KB
