Re: Extending amcheck to check toast size and compression

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Extending amcheck to check toast size and compression
Date: 2021-10-20 16:41:10
Message-ID: DE60E400-914B-47CD-9A01-1A82ABEE3FCA@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Oct 19, 2021, at 1:58 PM, Greg Stark <stark(at)mit(dot)edu> wrote:
>
> Right so here's a review.
>
> I think the patch is committable as is. It's an improvement and it
> does the job as promised. I do have some comments but I don't think
> they're serious issues and would actually be pretty happy committing
> it as is. Fwiw I didn't realize how short the patch was at first and
> it probably doesn't need yet another review.

Thanks for reviewing!

> /* Toasted attributes too large to be untoasted should never be stored */
> if (toast_pointer.va_rawsize > VARLENA_SIZE_LIMIT)
>
> 1) I know this used to say MaxAlloc -- personally I would probably go
> with that but either is fine. But if you're going to have a separate
> constant there could be more of a comment explaining why that's the
> maximum -- probably with a pointer to MaxAlloc and postgres.h's
> VARSIZE macros.

I find the comment a bit verbose that way, but maybe people like it better? How does this look:

diff --git a/contrib/amcheck/verify_heapam.c b/contrib/amcheck/verify_heapam.c
index 774a70f63d..988e104d8e 100644
--- a/contrib/amcheck/verify_heapam.c
+++ b/contrib/amcheck/verify_heapam.c
@@ -30,7 +30,11 @@ PG_FUNCTION_INFO_V1(verify_heapam);
/* The number of columns in tuples returned by verify_heapam */
#define HEAPCHECK_RELATION_COLS 4

-/* The largest valid toast va_rawsize */
+/*
+ * The largest valid toast va_rawsize. This is the same as the MaxAllocSize
+ * constant from memutils.h, and is the largest size that can fit in a varlena
+ * va_header's 30-bit size field.
+ */
#define VARLENA_SIZE_LIMIT 0x3FFFFFFF

/*

> The switch statement at line 1443 seems a bit ... baroque. Is it
> clearer than a simple "if cmid != TOAST_PGLZ_COMPRESSION_ID && cmid !=
> TOAST_LZ4_COMPRESSION_ID)" ? I mean, I see this is easier to add more
> cases to but I found dealing with a case that falls through and no
> default is a lot of cognitive overhead to understand what is in the
> end just effectively a simple branch.

The variable cmid (which stands for compression method identifier), is of enum type ToastCompressionId. From toast_compression.h:

typedef enum ToastCompressionId
{
TOAST_PGLZ_COMPRESSION_ID = 0,
TOAST_LZ4_COMPRESSION_ID = 1,
TOAST_INVALID_COMPRESSION_ID = 2
} ToastCompressionId;

There is clearly room for one more compression algorithm in that list without overflowing the 2 bits reserved for such values, and I'd not like to gamble on some future hacker who adds TOAST_MY_FANCY_COMPRESSION_ID = 3 remembering to update contrib/amcheck. I used a switch statement to trigger a compiler warning in such an event.

> Fwiw compilers aren't always the best at optimizing switch statements.
> It's entirely possible a compiler may end up building a whole lookup
> table of jumps for this thing. Not that it's performance critical but
> ...

That may be a fair argument, but I'm a huge fan of using enums and switch statements to elicit the compiler's help in future modifications to the code. This is the first time I've heard a complaint of this sort and I'm unsure how to respond. How common is this optimization problem on modern compilers?

> But all that's more words than necessary for a minor style comment.

Ok.

> Fwiw I spent a few minutes thinking about and writing up this
> suggestion and then only afterwards realized the code in question
> wasn't from this patch. I'll mention it anyways but it's not relevant
> to this patch review I guess :)

Sure, we can discuss it.

> I found the whole expected_chunk_seq parameter thing a bit confusing
> and less useful than possible. I would instead suggestion:
>
> Allocate an array of the expected number of chunk numbers before
> calling check_toast_tuple and then just gather the chunk_seq that are
> returned.

Of course, you might get more chunks back than you expected, and overflow your array. But assuming you realloc, and assuming the checker avoids going into an infinite loop, that is one option.

> When it's finished you can do things like: a) Check if
> they're all ascending and report index corruption if not. b) Check if
> any numbers are missing and report which ones are missing and/or how
> many. c) Check if there are duplicates and report that. These would
> all be easier for a user to interpret than "index scan returned chunk
> 5 when expecting chunk 9".

This was reworked multiple times. The problem is how to think about the missing or extra chunks. One interpretation is that the chunks themselves are corrupt, but another interpretation is that the toast index is corrupt and causing the index scan over the toast table to visit the same chunk multiple times, or in the wrong order, etc. The index scan itself might bomb out with a segfault, or go into an infinite loop. It's hard to predict such things in the face of corruption, especially when considering that the index scan code might be modified in the future. I'm not claiming there is no room for improvement here -- likely there is -- but it is not simple, and the patch that would result would be larger than the patch actually being reviewed. I'd rather leave such a project for another day.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Anders Kaseorg 2021-10-20 17:09:51 Re: [PATCH] Prefer getenv("HOME") to find the UNIX home directory
Previous Message David G. Johnston 2021-10-20 16:35:19 Re: Some questions about schema privileges