Re: pg_amcheck contrib application

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(at)paquier(dot)xyz>, Amul Sul <sulamul(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_amcheck contrib application
Date: 2021-04-08 19:02:36
Message-ID: E4F162F1-0DF4-4919-BD75-AF73F125D1E8@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Apr 7, 2021, at 1:16 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Sun, Apr 4, 2021 at 8:02 PM Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> wrote:
>> v18-0001 - Finishes work started in commit 3b6c1259f9 that was overlooked owing to how I had separated the changes in v17-0002 vs. v17-0003
>
> Committed.

Thank you.

>> v18-0002 - Postpones the toast checks for a page until after the main table page lock is released
>
> Committed, but I changed list_free() to list_free_deep() to avoid a
> memory leak, and I revised the commit message to mention the important
> point that we need to avoid following TOAST pointers from
> potentially-prunable tuples.

Thank you, and yes, I agree with that change.

>> v18-0003 - Improves the corruption messages in ways already discussed earlier in this thread. Changes the tests to expect the new messages, but adds no new checks
>
> Kibitizing your message wording:
>
> "toast value %u chunk data is null" -> "toast value %u chunk %d has
> null data". We can mention the chunk number this way.

Changed.

> "toast value %u corrupt extended chunk has invalid varlena header: %0x
> (sequence number %d)" -> "toast value %u chunk %d has invalid varlena
> header %0x". We can be more consistent about how we incorporate the
> chunk number into the text, and we don't really need to include the
> word corrupt, because all of these are corruption complaints, and I
> think it looks better without the colon.

Changed.

> "toast value %u chunk sequence number %u does not match the expected
> sequence number %u" -> "toast value %u contains chunk %d where chunk
> %d was expected". Shorter. Uses %d for a sequence number instead of
> %u, which I think is correct -- anyway we should have them all one way
> or all the other. I think I'd rather ditch the "sequence number"
> technology and just talk about "chunk %d" or whatever.

I don't agree with this one. I do agree with changing the message, but not to the message you suggest.

Imagine a toasted attribute with 18 chunks numbered [0..17]. Then we update the toast to have only 6 chunks numbered [0..5] except we corruptly keep chunks numbered [12..17] in the toast table. We'd rather see a report like this:

# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 chunk 6 has sequence number 12, but expected sequence number 6
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 chunk 7 has sequence number 13, but expected sequence number 7
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 chunk 8 has sequence number 14, but expected sequence number 8
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 chunk 9 has sequence number 15, but expected sequence number 9
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 chunk 10 has sequence number 16, but expected sequence number 10
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 chunk 11 has sequence number 17, but expected sequence number 11
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 was expected to end at chunk 6, but ended at chunk 12

than one like this:

# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 contains chunk 12 where chunk 6 was expected
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 contains chunk 13 where chunk 7 was expected
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 contains chunk 14 where chunk 8 was expected
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 contains chunk 15 where chunk 9 was expected
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 contains chunk 16 where chunk 10 was expected
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 contains chunk 17 where chunk 11 was expected
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 was expected to end at chunk 6, but ended at chunk 12

because saying the toast value ended at "chunk 12" after saying that it contains "chunk 17" is contradictory. You need the distinction between the chunk number and the chunk sequence number, since in corrupt circumstances they may not be the same.

> "toast value %u chunk sequence number %u exceeds the end chunk
> sequence number %u" -> "toast value %u chunk %d follows last expected
> chunk %d"

Changed.

> "toast value %u chunk size %u differs from the expected size %u" ->
> "toast value %u chunk %d has size %u, but expected size %u"

Changed.

> Other complaints:
>
> Your commit message fails to mention the addition of
> VARATT_EXTERNAL_GET_POINTER, which is a significant change/bug fix
> unrelated to message wording.

Right you are.

> It feels like we have a non-minimal number of checks/messages for the
> series of toast chunks. I think that if we find a chunk after the last
> chunk we were expecting to find (curchunk > endchunk) and you also get
> a message if we have the wrong number of chunks in total (chunkno !=
> (endchunk + 1)). Now maybe I'm wrong, but if the first message
> triggers, it seems like the second message must also trigger. Is that
> wrong? If not, maybe we can get rid of the first one entirely? That's
> such a small change I think we could include it in this same patch, if
> it's a correct idea.

Motivated by discussions we had off-list, I dug into this one.

Purely as manual testing, and not part of the patch, I hacked the backend a bit to allow direct modification of the toast table. After corrupting the toast with the following bit of SQL:

WITH chunk_limit AS (
SELECT chunk_id, MAX(chunk_seq) AS maxseq
FROM $toastname
GROUP BY chunk_id)
INSERT INTO $toastname (chunk_id, chunk_seq, chunk_data)
(SELECT t.chunk_id,
t.chunk_seq + cl.maxseq + CASE WHEN t.chunk_seq < 3 THEN 1 ELSE 7 END,
t.chunk_data
FROM $toastname t
INNER JOIN chunk_limit cl
ON t.chunk_id = cl.chunk_id)

pg_amcheck reports the following corruption messages:

# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 chunk 6 follows last expected chunk 5
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 chunk 7 follows last expected chunk 5
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 chunk 8 follows last expected chunk 5
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 chunk 9 has sequence number 15, but expected sequence number 9
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 chunk 10 has sequence number 16, but expected sequence number 10
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 chunk 11 has sequence number 17, but expected sequence number 11
# heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
# toast value 16444 was expected to end at chunk 6, but ended at chunk 12

I think if we'd left out the first three messages, it would read strangely. We would be complaining about three chunks with the wrong sequence number, then conclude that there were six extra chunks. A sufficiently savvy user might deduce the presence of chunks 6, 7, and 8, but the problem is more obvious (to my eyes, at least) if we keep the first three messages. This seems like a judgement call and not a clear argument either way, so if you still want me to change it, I guess I don't mind doing so.

> On a related note, as I think I said before, I still think we should
> be rejiggering this so that we're not testing both the size of each
> individual chunk and the total size, because that ought to be
> redundant. That might be better done as a separate patch but I think
> we should try to clean it up.

Can you point me to the exact check you are mentioning, and with which patch applied? I don't see any examples of this after applying the v18-0003.

>> v18-0004 - Adding corruption checks of toast pointers. Extends the regression tests to cover the new checks.
>
> I think we could check that the result of
> VARATT_EXTERNAL_GET_COMPRESS_METHOD is one of the values we expect to
> see.

Yes. I had that before, pulled it out along with other toast compression checks, but have put it back in for v19.

> Using AllocSizeIsValid() seems pretty vile. I know that MaxAllocSize
> is 0x3FFFFFFF in no small part because that's the maximum length that
> can be represented by a varlena, but I'm not sure it's a good idea to
> couple the concepts so closely like this. Maybe we can just #define
> VARLENA_SIZE_LIMIT in this file and use that, and a message that says
> size %u exceeds limit %u.

Changed.

> I'm a little worried about whether the additional test cases are
> Endian-dependent at all. I don't immediately know what might be wrong
> with them, but I'm going to think about that some more later. Any
> chance you have access to a Big-endian box where you can test this?

I don't have a Big-endian box, but I think one of them may be wrong now that you mention the issue:

# Corrupt column c's toast pointer va_extinfo field

The problem is that the 30-bit extsize and 2-bit cmid split is not being handled in the perl test, and I don't see an easy way to have perl's pack/unpack do that for us. There isn't any requirement that each possible corruption we check actually be manifested in the regression tests. The simplest solution is to remove this problematic test, so that's what I did. The other two new tests corrupt c_va_toastrelid and c_va_rawsize, both of which are read/written using unpack/pack, so perl should handle the endianness for us (I hope).

If you'd rather not commit these two extra tests, you don't have to, as I've split them out into v19-0003. But if you do commit them, it makes more sense to me to be one commit with 0002+0003 together, rather than separately. Not committing the new tests just means that verify_heapam() is able to detect additional forms of corruption that we're not covering in the regression tests. But that's already true for some other corruption types, such as detecting toast chunks with null sequence numbers.

Attachment Content-Type Size
v19-0001-amcheck-rewording-messages-and-fixing-alignment.patch application/octet-stream 8.1 KB
v19-0002-amcheck-adding-toast-pointer-corruption-checks.patch application/octet-stream 4.2 KB
v19-0003-amcheck-additional-regression-test-coverage.patch application/octet-stream 3.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2021-04-08 19:07:27 Re: multi-install PostgresNode fails with older postgres versions
Previous Message Robert Haas 2021-04-08 18:58:04 Re: [HACKERS] Custom compression methods