Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Spurious "apparent wraparound" via SimpleLruTruncate() rounding
Date: 2019-07-24 08:27:18
Message-ID: CAKPRHzJD_Y2VSJjmYZg7+8yFq6cMJVFWJURK7AXTX6OuZeuAig@mail.gmail.com
Lists: pgsql-hackers

Sorry in advance for the link-breaking message forced by Gmail.

https://www.postgresql.org/message-id/flat/20190202083822(dot)GC32531(at)gust(dot)leadboat(dot)com

> 1. The result of the test is valid only until we release the SLRU ControlLock,
> which we do before SlruScanDirCbDeleteCutoff() uses the cutoff to evaluate
> segments for deletion. Once we release that lock, latest_page_number can
> advance. This creates a TOCTOU race condition, allowing excess deletion:
>
>
> [local] test=# table trunc_clog_concurrency ;
> ERROR: could not access status of transaction 2149484247
> DETAIL: Could not open file "pg_xact/0801": No such file or directory.

It seems like some other vacuum process saw a larger cutoff page? If I'm
not missing something, the missing page was no longer the
"recently-populated" page at that time (which I understand to be the last
page that holds valid data). Couldn't we just ignore ENOENT there?
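
To make the "ignore ENOENT" idea concrete, here is a minimal sketch in plain POSIX C. It is not PostgreSQL's SLRU code: the SegOpenResult type and the open_segment_tolerant() helper are hypothetical names invented for illustration, and whether a missing segment is really safe to ignore is exactly the open question above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical result type; not part of PostgreSQL. */
typedef enum
{
    SEG_OK,         /* file opened, *fd_out is valid */
    SEG_MISSING,    /* ENOENT: segment assumed to be already truncated */
    SEG_ERROR       /* any other failure, still a hard error */
} SegOpenResult;

/*
 * Hypothetical helper: open a segment file read-only, reporting a
 * missing file (ENOENT) separately from every other open() failure so
 * the caller can decide to ignore it.
 */
static SegOpenResult
open_segment_tolerant(const char *path, int *fd_out)
{
    int fd;

    assert(path != NULL && fd_out != NULL);

    fd = open(path, O_RDONLY);
    if (fd >= 0)
    {
        *fd_out = fd;
        return SEG_OK;
    }

    *fd_out = -1;
    /* errno is read immediately after the failed open() */
    return (errno == ENOENT) ? SEG_MISSING : SEG_ERROR;
}
```

A caller would treat SEG_MISSING as "nothing to read here" and keep raising an error on SEG_ERROR, so that permission or I/O problems are not silently hidden.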

> 2. By the time the "apparent wraparound" test fires, we've already WAL-logged
> the truncation. clog_redo() suppresses the "apparent wraparound" test,
> then deletes too much. Startup then fails:

I agree that if truncation is skipped after the WAL record has been
issued, it will lead to data loss at the next recovery. But the following log..:

> 881997 2019-02-10 02:53:32.105 GMT FATAL: could not access status of transaction 708112327
> 881997 2019-02-10 02:53:32.105 GMT DETAIL: Could not open file "pg_xact/02A3": No such file or directory.
> 881855 2019-02-10 02:53:32.107 GMT LOG: startup process (PID 881997) exited with exit code 1

If it came from the same cause as 1, the log is simply ignorable.
Recovery stopping on that error is unacceptable, but the ENOENT itself
can be ignored for the same reason.
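
For reference, the file names in both error reports follow from the transaction IDs by simple arithmetic. The sketch below assumes the default 8 kB block size (BLCKSZ = 8192), 4 transaction statuses per clog byte, and 32 pages per SLRU segment; the helper function names are made up for this example and are not PostgreSQL functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed defaults; a build with a non-default block size differs. */
#define BLCKSZ                  8192
#define CLOG_XACTS_PER_BYTE     4
#define CLOG_XACTS_PER_PAGE     (BLCKSZ * CLOG_XACTS_PER_BYTE)
#define SLRU_PAGES_PER_SEGMENT  32

/* Hypothetical helper: clog page that holds the status of xid. */
static uint32_t
xid_to_page(uint32_t xid)
{
    return xid / CLOG_XACTS_PER_PAGE;
}

/* Hypothetical helper: segment file number that holds a given page. */
static uint32_t
page_to_segment(uint32_t page)
{
    return page / SLRU_PAGES_PER_SEGMENT;
}

/* Hypothetical helper: format the segment path as four hex digits. */
static void
segment_file_name(uint32_t xid, char *buf, size_t buflen)
{
    assert(buflen >= 13);   /* "pg_xact/XXXX" plus terminating NUL */
    snprintf(buf, buflen, "pg_xact/%04X",
             (unsigned) page_to_segment(xid_to_page(xid)));
}
```

Under these assumptions, transaction 2149484247 maps to pg_xact/0801 and transaction 708112327 maps to pg_xact/02A3, matching the two DETAIL lines quoted above.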

As a result, I agree with (a) (fix rounding) and (c) (test
wrap-around before writing WAL), but I'm not sure about the others. An
additional fix for the ignorable ENOENT is also needed.

What do you think about this?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
