Re: avoid multiple hard links to same WAL file after a crash

From: Greg Stark <stark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: avoid multiple hard links to same WAL file after a crash
Date: 2022-04-18 20:53:53
Message-ID: CAM-w4HMOcaApY8qiedkCKRGzfoviLgZrU_Eui=_wM1hV8k=d1Q@mail.gmail.com
Lists: pgsql-hackers

The readdir interface allows a process to be in the middle of reading
a directory. Unless a kernel were happy either to materialize the
entire directory list when the readdir starts, or to lock the entire
directory against modification for the entire time a process has a
readdir fd open, it's always going to be possible for a process to
have previously read the old directory entry and later see the new
directory entry. Kernels don't do any MVCC or cmin type of games, so
they're not going to be able to prevent it.

What's worse, of course, is that it may only happen in very large
directories. Most directories fit in a single block, and readdir may
buffer up all the entries a block at a time for efficiency, so the
problem may only be visible in directories that span multiple blocks.
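
To make that concrete, here is a rough sketch (mine, not anything in
PostgreSQL) of how a directory scanner could tolerate seeing the same
file under both its old and new names: group the entries by
(device, inode) rather than trusting names. The function name
scan_by_inode is hypothetical, and it assumes a POSIX filesystem where
st_dev and st_ino identify a file.

```python
import os
from collections import defaultdict


def scan_by_inode(path):
    """Group the entries of a directory by (device, inode).

    Hypothetical helper for illustration. If a file shows up under two
    names during one scan (a rename raced with the scan, or a leftover
    hard link exists), both names land in the same group.
    """
    groups = defaultdict(list)
    with os.scandir(path) as entries:
        for entry in entries:
            try:
                st = entry.stat(follow_symlinks=False)
            except FileNotFoundError:
                # The entry was removed or renamed after readdir returned it.
                continue
            groups[(st.st_dev, st.st_ino)].append(entry.name)
    return dict(groups)
```

Any group with more than one name is either a file the scan saw twice
or a genuine extra hard link. This doesn't make the scan atomic; it
only lets the caller notice the duplicate.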
