Re: BUG #17245: Index corruption involving deduplicated entries

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Kamigishi Rei <iijima(dot)yun(at)koumakan(dot)jp>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: BUG #17245: Index corruption involving deduplicated entries
Date: 2021-10-30 20:42:46
Message-ID: 20211030204246.d6zda3sa46bqpo5l@alap3.anarazel.de
Lists: pgsql-bugs

Hi,

On 2021-10-30 11:46:22 -0700, Peter Geoghegan wrote:
> Attached is a draft patch that fixes the problem.

I think it's probably worth adding an error check somewhere, so that
problems of this kind get detected with, uh, less effort.

I think it'd also be good to add a test that specifically verifies that
parallel vacuum correctly handles a mix of "parallel worthy" and not
"parallel worthy" indexes. It's too easy a mistake to make, and because
visible corruption is delayed, it's likely that we won't detect such cases.

> Also attached is a second patch. This adds assertions to
> heap_index_delete_tuples() to catch cases where a heap TID in an index
> points to an LP_UNUSED item in the heap (which is what this bug looked
> like, mostly). It also checks for certain more or less equivalent
> inconsistencies: the case where a heap TID in an index points to a
> line pointer that's past the end of the heap page's line pointer
> array, and the case where a heap TID in an index points directly to a
> heap-only tuple.

ISTM that at least a basic version of this is worth doing as a check throwing
an ERROR, rather than an assertion. It's hard to believe this'd be a
significant portion of the cost of heap_index_delete_tuples(), and I think it
would help catch problems a lot earlier.
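
To make the three conditions above concrete, here is a minimal standalone
sketch of such a check. It is not the patch and does not use PostgreSQL's
page structures: every type and function name (SketchHeapPage,
sketch_check_index_tid, and so on) is hypothetical, and the page is modelled
as a plain array of line pointers. In the real code the rough counterparts
would presumably be PageGetMaxOffsetNumber(), ItemIdIsUsed() and
HeapTupleHeaderIsHeapOnly(), with a failing result reported via an ERROR
rather than an Assert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified line pointer states (loosely after LP_*). */
typedef enum {
    SK_LP_UNUSED,
    SK_LP_NORMAL,
    SK_LP_REDIRECT,
    SK_LP_DEAD
} SketchLpState;

/* Hypothetical line pointer: its state plus a heap-only marker. */
typedef struct {
    SketchLpState state;
    bool heap_only;     /* tuple is heap-only, so no index entry may point at it */
} SketchLinePointer;

/* Hypothetical heap page: just its line pointer array. */
typedef struct {
    size_t nitems;                    /* number of line pointers on the page */
    const SketchLinePointer *items;   /* offset numbers are 1-based */
} SketchHeapPage;

typedef enum {
    TID_OK,
    TID_PAST_END,    /* offset is beyond the line pointer array */
    TID_UNUSED,      /* offset refers to an unused line pointer */
    TID_HEAP_ONLY    /* offset refers directly to a heap-only tuple */
} SketchTidCheck;

/* Check one heap offset number taken from an index tuple's TID. */
SketchTidCheck
sketch_check_index_tid(const SketchHeapPage *page, size_t offnum)
{
    assert(page != NULL);

    if (offnum < 1 || offnum > page->nitems)
        return TID_PAST_END;

    const SketchLinePointer *lp = &page->items[offnum - 1];

    if (lp->state == SK_LP_UNUSED)
        return TID_UNUSED;
    if (lp->state == SK_LP_NORMAL && lp->heap_only)
        return TID_HEAP_ONLY;

    return TID_OK;
}

/* Message a caller could attach to an error report. */
const char *
sketch_tid_check_message(SketchTidCheck result)
{
    switch (result) {
        case TID_OK:        return "ok";
        case TID_PAST_END:  return "index TID points past end of line pointer array";
        case TID_UNUSED:    return "index TID points to unused line pointer";
        case TID_HEAP_ONLY: return "index TID points to heap-only tuple";
    }
    return "unknown";
}
```

Each check is a bounds comparison or a flag test on a line pointer the
caller is already looking at, which is why the cost seems unlikely to
matter next to the rest of the work.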

Greetings,

Andres Freund
