Re: WAL record CRC calculated incorrectly because of underlying buffer modification

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL record CRC calculated incorrectly because of underlying buffer modification
Date: 2024-05-11 04:00:00
Message-ID: 0a2e14d3-fbc8-9e86-a968-e0c4a465e5b0@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Thomas and Andres,

Thank you for looking at this!

11.05.2024 06:26, Thomas Munro wrote:
> On Sat, May 11, 2024 at 3:57 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>> On 2024-05-10 16:00:01 +0300, Alexander Lakhin wrote:
>>> and discovered that XLogRecordAssemble() calculates CRC over a buffer,
>>> that might be modified by another process.
>> If, with "might", you mean that it's legitimate for that buffer to be
>> modified, I don't think so. The bug is that something is modifying the buffer
>> despite it being exclusively locked.
>>
>> I.e. what we likely have here is a bug somewhere in the hash index code.
> I don't have a good grip on the higher level locking protocols of
> hash.c, but one microscopic thing jumps out:
>
> /*
> * bucket buffer was not changed, but still needs to be
> * registered to ensure that we can acquire a cleanup lock on
> * it during replay.
> */
> if (!xlrec.is_primary_bucket_page)
> {
> uint8 flags = REGBUF_STANDARD |
> REGBUF_NO_IMAGE | REGBUF_NO_CHANGE;
>
> XLogRegisterBuffer(0, bucket_buf, flags);
> }
>
> That registers a buffer that is pinned but not content-locked, and we
> tell xloginsert.c not to copy its image into the WAL, but it does it
> anyway because:
>
> /*
> * If needs_backup is true or WAL checking is enabled for current
> * resource manager, log a full-page write for the current block.
> */
> include_image = needs_backup || (info & XLR_CHECK_CONSISTENCY) != 0;
>
> So I guess it copies the image on dodo, which has:
>
> 'PG_TEST_EXTRA' => 'ssl ldap
> kerberos wal_consistency_checking libpq_encryption xid_wraparound'
>
> Perhaps a no-image, no-change registered buffer should not be
> including an image, even for XLR_CHECK_CONSISTENCY? It's actually
> useless for consistency checking too I guess, this issue aside,
> because it doesn't change anything so there is nothing to check.

Yes, I think something wrong is here. I've reduced the reproducer to:
cat << 'EOF' | psql
CREATE TABLE hash_cleanup_heap(keycol INT);
CREATE INDEX hash_cleanup_index on hash_cleanup_heap USING HASH (keycol);

BEGIN;
INSERT INTO hash_cleanup_heap SELECT 1 FROM generate_series(1, 500) as i;
ROLLBACK;
EOF

cat << 'EOF' | psql &
INSERT INTO hash_cleanup_heap SELECT 1 FROM generate_series(1, 500) as i;

DROP TABLE hash_cleanup_heap;
EOF

cat << 'EOF' | psql &
SELECT pg_sleep(random() / 20);
VACUUM hash_cleanup_heap;
EOF
wait
grep 'TRAP:' server.log

("wal_consistency_checking = all" and the xloginsert patch are still required)

and with additional logging I see:
2024-05-11 03:45:23.424 UTC|law|regression|663ee9d3.1f96dd|LOG: !!!hashbucketcleanup| scan page buf: 1832
2024-05-11 03:45:23.424 UTC|law|regression|663ee9d3.1f96dd|CONTEXT: while vacuuming index "hash_cleanup_index" of
relation "public.hash_cleanup_heap"
2024-05-11 03:45:23.424 UTC|law|regression|663ee9d3.1f96dd|STATEMENT:  VACUUM hash_cleanup_heap;
2024-05-11 03:45:23.424 UTC|law|regression|663ee9d3.1f96de|LOG: !!!_hash_doinsert| _hash_getbucketbuf_from_hashkey: 1822
2024-05-11 03:45:23.424 UTC|law|regression|663ee9d3.1f96de|STATEMENT:  INSERT INTO hash_cleanup_heap SELECT 1 FROM
generate_series(1, 500) as i;
2024-05-11 03:45:23.424 UTC|law|regression|663ee9d3.1f96dd|LOG: !!!hashbucketcleanup| xlrec.is_primary_bucket_page: 0,
buf: 1832, bucket_buf: 1822
2024-05-11 03:45:23.424 UTC|law|regression|663ee9d3.1f96dd|CONTEXT: while vacuuming index "hash_cleanup_index" of
relation "public.hash_cleanup_heap"
2024-05-11 03:45:23.424 UTC|law|regression|663ee9d3.1f96dd|STATEMENT:  VACUUM hash_cleanup_heap;
2024-05-11 03:45:23.424 UTC|law|regression|663ee9d3.1f96de|LOG: !!!_hash_doinsert| _hash_relbuf(rel, 1822)
2024-05-11 03:45:23.424 UTC|law|regression|663ee9d3.1f96de|STATEMENT:  INSERT INTO hash_cleanup_heap SELECT 1 FROM
generate_series(1, 500) as i;
TRAP: failed Assert("memcmp(block1_ptr, block1_copy, block1_len) == 0"), File: "xloginsert.c", Line: 949, PID: 2070237

Best regards,
Alexander

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2024-05-11 04:25:44 Re: WAL record CRC calculated incorrectly because of underlying buffer modification
Previous Message Thomas Munro 2024-05-11 03:26:26 Re: WAL record CRC calculated incorrectly because of underlying buffer modification