Re: [WIP] [B-Tree] Retail IndexTuple deletion

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP] [B-Tree] Retail IndexTuple deletion
Date: 2018-06-22 20:14:05
Message-ID: CAH2-Wzm7BS151fccSmeP2D8TBOEc5k6RjKtEM_zTeHJRzXi4qA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jun 22, 2018 at 12:43 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> On Fri, Jun 22, 2018 at 4:24 AM, Andrey V. Lepikhov
> <a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
>> According to your feedback, i develop second version of the patch.
>> In this version:
>> 1. High-level functions index_beginscan(), index_rescan() not used. Tree
>> descent made by _bt_search(). _bt_binsrch() used for positioning on the
>> page.
>> 2. TID list introduced in amtargetdelete() interface. Now only one tree
>> descent needed for deletion all tid's from the list with equal scan key
>> value - logical duplicates deletion problem.
>> 3. Only one WAL record for index tuple deletion per leaf page per
>> amtargetdelete() call.
>
> Cool.
>
> What is this "race" code about?

I noticed another bug in your patch, when running a
"wal_consistency_checking=all" smoke test. I do this simple, generic
test for anything that touches WAL-logging, actually -- it's a good
practice to adopt.

I enable "wal_consistency_checking=all" on the installation, create a
streaming replica with pg_basebackup (which also has
"wal_consistency_checking=all"), and then run "make installcheck"
against the primary. Here is what I see on the standby when I do this
with v2 of your patch applied:

9524/2018-06-22 13:03:12 PDT LOG: entering standby mode
9524/2018-06-22 13:03:12 PDT LOG: consistent recovery state reached
at 0/30000D0
9524/2018-06-22 13:03:12 PDT LOG: invalid record length at 0/30000D0:
wanted 24, got 0
9523/2018-06-22 13:03:12 PDT LOG: database system is ready to accept
read only connections
9528/2018-06-22 13:03:12 PDT LOG: started streaming WAL from primary
at 0/3000000 on timeline 1
9524/2018-06-22 13:03:12 PDT LOG: redo starts at 0/30000D0
9524/2018-06-22 13:03:32 PDT FATAL: inconsistent page found, rel
1663/16384/1259, forknum 0, blkno 0
9524/2018-06-22 13:03:32 PDT CONTEXT: WAL redo at 0/3360B00 for
Heap2/CLEAN: remxid 599
9523/2018-06-22 13:03:32 PDT LOG: startup process (PID 9524) exited
with exit code 1
9523/2018-06-22 13:03:32 PDT LOG: terminating any other active server processes
9523/2018-06-22 13:03:32 PDT LOG: database system is shut down

I haven't investigated this at all, but I assume that the problem is a
simple oversight. The new ItemIdSetDeadRedirect() concept that you've
introduced probably necessitates changes in both the WAL logging
routines and the redo/recovery routines. You need to go make those
changes. (By the way, I don't think you should be using the constant
"3" with the ItemIdIsDeadRedirection() macro definition.)

Let me know if you get stuck on this, or need more direction.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2018-06-22 20:54:52 Re: bug with expression index on partition
Previous Message Alvaro Herrera 2018-06-22 20:03:51 Re: [PATCH] Include application_name in "connection authorized" log message