Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Subject: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Date: 2021-11-19 18:55:41
Message-ID: EBDC55C1-5C00-42C2-BFA2-E6882E5E1AF9@anarazel.de
Lists: pgsql-bugs

Hi,

On November 18, 2021 6:55:08 PM PST, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>On Thu, Nov 18, 2021 at 4:36 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>> Attached is such an isolationtest. In an unmodified HEAD it ends up with
>
>> without amcheck detecting corruption at that point:(.
>
>The best way to teach amcheck to detect this kind of thing is bound to
>be verification of HOT chains themselves:
>
>https://www.postgresql.org/message-id/flat/CAH2-Wznphpd490o%2Bur_bi6-YmC47hRu3Qyxod72RLqC-Uvuhcg%40mail.gmail.com

That wouldn't reliably catch this kind of thing, because the HOT chains can easily end up looking correct enough.

>It would also be nice to teach verify_nbtree.c to look out for the
>presence of index tuples that shouldn't be there during heapallindexed
>verification, but that's significantly harder.

I think what we really need is a pass that verifies that heap tuples reached via the index actually match the index key. There are just too many things that can break that. It won't be blazingly fast, but there are way too many problems that other approaches won't find.
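
To make the shape of such a pass concrete, here is a toy sketch in Python. It is not PostgreSQL or amcheck code: the heap is modeled as a dict keyed by a (block, offset) pair, the index as a flat list of entries, and the names IndexEntry and verify_index_against_heap are hypothetical. It only illustrates the direction of the check, going from each index entry to the heap tuple it points at and comparing the stored key with the heap value.

```python
from typing import Dict, List, NamedTuple, Tuple

# Toy model only: these types are hypothetical stand-ins, not PostgreSQL structures.
Tid = Tuple[int, int]  # (block number, offset), standing in for a heap TID


class IndexEntry(NamedTuple):
    key: int   # indexed column value as stored in the index
    tid: Tid   # heap tuple this index entry points to


def verify_index_against_heap(
    index: List[IndexEntry], heap: Dict[Tid, Dict[str, int]], column: str
) -> List[str]:
    """Follow every index entry to the heap and report entries that do not match."""
    problems = []
    for entry in index:
        row = heap.get(entry.tid)
        if row is None:
            # The index points at a slot that holds no tuple.
            problems.append(f"{entry.tid}: index entry points to a missing heap tuple")
        elif row[column] != entry.key:
            # The tuple exists, but its column value disagrees with the index key.
            problems.append(
                f"{entry.tid}: index key {entry.key!r} != heap value {row[column]!r}"
            )
    return problems


heap = {(0, 1): {"a": 10}, (0, 2): {"a": 20}}
index = [IndexEntry(10, (0, 1)), IndexEntry(30, (0, 2)), IndexEntry(40, (0, 3))]
for problem in verify_index_against_heap(index, heap, "a"):
    print(problem)
```

The sketch visits one heap tuple per index entry, which is why a real pass of this kind would not be fast on a large index. It also ignores visibility, HOT chain traversal, and operator-class comparison semantics, all of which an actual implementation would have to handle.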

Andres

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
