Re: Preventing indirection for IndexPageGetOpaque for known-size page special areas

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Preventing indirection for IndexPageGetOpaque for known-size page special areas
Date: 2022-03-28 15:09:10
Message-ID: CAEze2Wjjx4REsge5kxNjDjKDVvkqbruqxVLykxj_UufZ0xzbOw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 28 Mar 2022 at 06:33, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Wed, Feb 16, 2022 at 10:24:58PM +0100, Matthias van de Meent wrote:
> > A first good reason to do this is preventing further damage when a
> > page is corrupted: if I can somehow overwrite pd_special,
> > non-assert-enabled builds would start reading and writing at arbitrary
> > offsets from the page pointer, quite possibly in subsequent buffers
> > (or worse, on the stack, in case of stack-allocated blocks).
>
> Well, pageinspect has proved to be rather careless in this area for a
> couple of years, but this just came from the fact that we called
> PageGetSpecialPointer() before checking that PageGetSpecialSize() maps
> with the MAXALIGN'd size of the opaque area, if the related AM has
> one.
>
> I am not sure that we need to worry about corruptions, as data
> checksums would offer protection for most cases users usually need to
> worry about, at the exception of page corruptions because of memory
> for pages already read in the PG shared buffers from disk or even the
> OS cache.

Not all clusters have checksums enabled (boo!, but we can't
realistically fix that), so on-disk corruption could reasonably
propagate to the rest of such system. Additionally, checksums are only
checked on read, and updated when the page is written to disk (in
PageSetChecksumCopy called from FlushBuffer), but this does not check
for signs of page corruption. As such, a memory bug resulting in
in-memory corruption of pd_special would persist and would currently
have the potential to further corrupt neighbouring buffers.

> > A second reason would be less indirection to get to the opaque
> > pointer. This should improve performance a bit in those cases where we
> > (initially) only want to access the [Index]PageOpaque struct.
>
> We don't have many cases that do that, do we?

Indeed not many, but I know that at least the nbtree-code has some
cases where it uses the opaque before other fields in the header are
accessed (sometimes even without accessing the header for other
reasons): in _bt_moveright and _bt_walk_left. There might be more; but
these are cases I know of.

> > Lastly, 0002 and 0003 remove some repetitive tasks of dealing with the
> > [nbtree, hash] opaques by providing a typed accessor macro similar to
> > what is used in the GIN and (SP-)GIST index methods; improving the
> > legibility of the code and decreasing the churn.
>
> +#define PageGetSpecialOpaque(page, OpaqueDataTyp) \
> +( \
> + AssertMacro(PageGetPageSize(page) == BLCKSZ && \
> + PageGetSpecialSize(page) == MAXALIGN(sizeof(OpaqueDataTyp))), \
> + (OpaqueDataTyp *) ((char *) (page) + (BLCKSZ - MAXALIGN(sizeof(OpaqueDataTyp)))) \
> +)
>
> Did you notice PageValidateSpecialPointer() that mentions MSVC? Could
> this stuff a problem if not an inline function.

I'm not sure; but the CFbot build (using whatever MSVC is used with
latest Window Server 2019, VS 19) failed to complain in this case.
Furthermore, the case mentioned in the comment of
PageValidateSpecialPointer() refers to problems that only happened
after the macro was updated from one MacroAssert to three seperate
MacroAssert()s; this new code currently only has one, so we should be
fine for now.

> Perhaps you should do a s/OpaqueDataTyp/OpaqueDataType/.

This has been fixed in attached v2; and apart from tweaks in the
commit messages there have been no other changes.

> As a whole, this patch set looks like an improvement in terms of
> checks and consistency regarding the special area handling across the
> different AMs for the backend code, while reducing the MAXALIGN-ism on
> all those sizes, and this tends to be missed.

Thanks!

-Matthias

Attachment Content-Type Size
v2-0001-Add-known-size-pre-aligned-special-area-pointer-m.patch application/x-patch 4.0 KB
v2-0002-Update-nbtree-code-to-use-PageGetSpecialOpaque-re.patch application/x-patch 44.7 KB
v2-0003-Update-hash-code-to-use-PageGetSpecialOpaque-repl.patch application/x-patch 22.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-03-28 15:20:42 Re: Add a pg_get_query_def function (was Re: Deparsing rewritten query)
Previous Message Alvaro Herrera 2022-03-28 15:06:14 Re: support for MERGE