Re: FSM Corruption (was: Could not read block at end of the relation)

From: Noah Misch <noah(at)leadboat(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-bugs <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: FSM Corruption (was: Could not read block at end of the relation)
Date: 2024-04-06 22:30:37
Message-ID: 20240406223037.c5@rfd.leadboat.com
Lists: pgsql-bugs

Your v3 has the right functionality. As further confirmation of the fix, I
tried reverting the non-test parts of commit 917dc7d "Fix WAL-logging of FSM
and VM truncation". That commit's 008_fsm_truncation.pl fails with 917dc7d
reverted from master, and adding this patch makes it pass again. I ran
pgindent and edited comments. I think the attached version is ready to go.

While updating comments in FreeSpaceMapPrepareTruncateRel(), I entered a
rabbit hole about the comments 917dc7d left about torn pages. I'm sharing
these findings just in case they help a reader of the $SUBJECT patch avoid the
same rabbit hole. Both fsm and vm read with RBM_ZERO_ON_ERROR, so I think
they're fine with torn pages. Per the README sentences I'm adding, FSM could
stop writing WAL. I'm not proposing that, but I do bet it's the right thing.
visibilitymap_prepare_truncate() has mirrored fsm truncate since 917dc7d. The
case for removing WAL there is clearer still, because parallel function
visibilitymap_clear() does not write WAL. I'm attaching a WIP patch to remove
visibilitymap_prepare_truncate() WAL. I'll abandon that or pursue it for v18,
in a different thread.
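
To make the torn-page point concrete, here is a rough sketch of the read
pattern I mean, loosely mirroring fsm_readbuf() in
src/backend/storage/freespace/freespace.c. The helper name is made up, and the
real code has more going on, but the essential part is RBM_ZERO_ON_ERROR: a
torn or unreadable FSM page just comes back as zeroes and gets reinitialized
as "no free space recorded here", so WAL isn't needed to keep it consistent.

    #include "postgres.h"

    #include "common/relpath.h"
    #include "storage/bufmgr.h"
    #include "storage/bufpage.h"
    #include "utils/rel.h"

    /*
     * Hypothetical helper, roughly mirroring fsm_readbuf(): read an FSM page,
     * tolerating torn or missing pages.  RBM_ZERO_ON_ERROR makes the buffer
     * manager hand back an all-zeroes page instead of raising an error, and a
     * new/zeroed page is simply initialized as empty.
     */
    static Buffer
    fsm_read_tolerant(Relation rel, BlockNumber blkno)
    {
        Buffer      buf;

        buf = ReadBufferExtended(rel, FSM_FORKNUM, blkno,
                                 RBM_ZERO_ON_ERROR, NULL);

        if (PageIsNew(BufferGetPage(buf)))
        {
            /* The real code initializes under an exclusive buffer lock. */
            LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
            if (PageIsNew(BufferGetPage(buf)))
                PageInit(BufferGetPage(buf), BLCKSZ, 0);
            LockBuffer(buf, BUFFER_LOCK_UNLOCK);
        }

        return buf;
    }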

On Fri, Mar 29, 2024 at 09:47:52AM +0100, Ronan Dunklau wrote:
> Le jeudi 21 mars 2024, 14:51:25 CET Ronan Dunklau a écrit :
> > Le samedi 16 mars 2024, 05:58:34 CET Noah Misch a écrit :
> > > > - Get profiles with both master and patched. (lseek or freespace.c
> > > >   functions rising by 0.1%-1% would fit what we know.)
>
> Running perf during the same benchmark, I get the following samples for lseek:
>
> With the patch:
>
> 0.02% postgres libc.so.6 [.] llseek@GLIBC_2.2.5
>
> Without the patch:
>
> 0.01% postgres libc.so.6 [.] llseek@GLIBC_2.2.5
>
>
> So nothing too dramatic.
> I haven't been able to come up with a better benchmark.

I tried, without success, to reproduce the benefit of commit 719c84c "Extend
relations multiple blocks at a time to improve scalability" like
https://www.postgresql.org/message-id/flat/CA%2BTgmob7xED4AhoqLspSOF0wCMYEomgHfuVdzNJnwWVoE_c60g%40mail.gmail.com
did. My conditions must be lacking whatever factor caused the consistent >20%
improvement in the linked message. Environment was ext4 on magnetic disk,
Linux 3.10.0-1160.99.1.el7.x86_64, 8 threads. I'm attaching the script I used
to test 719c84c^, 719c84c, and [719c84c with GetPageWithFreeSpace() made to
always find nothing]. I'm also attaching the script's output after filtration
through "grep ^TIME | sort -n". The range of timings within each test
scenario dwarfs any difference across scenarios. I casually tried some other
variations without obvious change:

- 24MB shared_buffers
- 128MB shared_buffers
- 4 threads
- 32 threads
- bgwriter_flush_after & backend_flush_after at their defaults, instead of 0
- 2024-04 code (e2a2357) unmodified, then with GetPageWithFreeSpace() made to always find nothing (see the sketch after this list)
- logged tables
- INSERT ... SELECT instead of COPY
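
For clarity, the "always find nothing" variant amounts to something like the
following throwaway benchmarking hack (not part of any patch):

    /*
     * Throwaway hack: make GetPageWithFreeSpace() in
     * src/backend/storage/freespace/freespace.c claim the FSM never knows of
     * a page with enough room, so every insertion takes the
     * relation-extension path.
     */
    BlockNumber
    GetPageWithFreeSpace(Relation rel, Size spaceNeeded)
    {
        return InvalidBlockNumber;  /* "no suitable page known" */
    }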

If I were continuing the benchmark study, I would try SSD, a newer kernel,
and/or shared_buffers=48GB. Instead, since your perf results show only +0.01%
CPU from new lseek() calls, I'm going to stop there and say it's worth taking
the remaining risk that some realistic scenario gets a material regression
from those new lseek() calls.
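
For anyone wondering where the new lseek() calls come from in the first place:
refusing to trust an FSM answer that points past the end of the relation means
asking the storage manager for the current block count, and for md.c that is
an lseek(SEEK_END) on the last segment file. Roughly like the sketch below,
which illustrates the cost, not the actual logic of the v4 patch; the helper
name is invented.

    #include "postgres.h"

    #include "storage/bufmgr.h"
    #include "storage/freespace.h"
    #include "utils/rel.h"

    /*
     * Illustration only: a caller that discards an FSM answer pointing past
     * the end of the relation.  RelationGetNumberOfBlocks() goes through
     * smgrnblocks(), which for md.c issues the lseek(SEEK_END) the perf
     * numbers above are measuring.
     */
    static BlockNumber
    get_target_block_checked(Relation rel, Size spaceNeeded)
    {
        BlockNumber target = GetPageWithFreeSpace(rel, spaceNeeded);
        BlockNumber nblocks = RelationGetNumberOfBlocks(rel);

        if (target != InvalidBlockNumber && target >= nblocks)
            return InvalidBlockNumber;  /* stale FSM entry; extend instead */

        return target;
    }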

Attachment Content-Type Size
bench-bulk-extend.sh application/x-sh 3.0 KB
bulk-extend-times.txt text/plain 28.8 KB
fsm-past-end-v4nm.patch text/plain 10.3 KB
vm-torn-page-fine-v0.1.patch text/plain 4.5 KB
