Re: Direct I/O

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Noah Misch <noah(at)leadboat(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Direct I/O
Date: 2023-04-09 02:56:53
Message-ID: CA+hUKG+F8HgEoimV-42bFXJbsd4yeQ6DF1VEc2LZ4bB-OfcV6Q@mail.gmail.com
Lists: pgsql-hackers

On Sun, Apr 9, 2023 at 2:18 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2023-04-09 13:55:33 +1200, Thomas Munro wrote:
> > I think that particular thing might relate to modifications of the
> > user buffer while a write is in progress (breaking btrfs's internal
> > checksums). I don't think we should ever do that ourselves (not least
> > because it'd break our own checksums). We lock the page during the
> > write so no one can do that, and then we sleep in a synchronous
> > syscall.
>
> Oh, but we actually *do* modify pages while IO is going on. I wonder if you
> hit the jack pot here. The content lock doesn't prevent hint bit
> writes. That's why we copy the page to temporary memory when computing
> checksums.

More like the jackpot hit me.

Woo, I can now reproduce this locally on a loop filesystem.
Previously I had missed a step: the parallel worker seems to be
necessary. More soon.
