Re: [Patch] Optimize dropping of relation buffers using dlist

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Patch] Optimize dropping of relation buffers using dlist
Date: 2020-10-22 21:45:05
Message-ID: CA+hUKGJ7Xv9NqREGjxm4VO5e=42YPs_XJZ-sHPdhejo3341_BQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 22, 2020 at 8:32 PM tsunakawa(dot)takay(at)fujitsu(dot)com
<tsunakawa(dot)takay(at)fujitsu(dot)com> wrote:
> I'm probably being silly, but can't we avoid the problem by using fstat() instead of lseek(SEEK_END)? Would they return the same value from the i-node?

Amazingly, st_size can disagree with SEEK_END when using the Linux NFS
client, but its behaviour is worse. Here's a sequence from a Linux
NFS client talking to a Linux NFS server with no free space. This
time, I also replaced the fsync() with sleep(60), just to make it
clear that SEEK_END offset can move at any time due to asynchronous
activity in kernel threads:

lseek(..., SEEK_END) = 9670656
fstat(...) = 0, st_size = 9670656

write(...) = 8192
lseek(..., SEEK_END) = 9678848
fstat(...) = 0, st_size = 9670656 (*1)

sleep(...) = 0

lseek(..., SEEK_END) = 9670656 (*2)
fstat(...) = 0, st_size = 9670656

fsync(...) = -1
lseek(..., SEEK_END) = 9670656
fstat(...) = 0, st_size = 9670656
fsync(...) = 0

However, I'm not entirely sure which phenomena visible here to blame
on which subsystems, and therefore which things to expect on local
filesystems, or on other operating systems. I can say that with a
FreeBSD NFS client and the same Linux NFS server, I don't see
phenomenon *1 (unsurprising) but I do see phenomenon *2 (surprising to
me).

> Or, can't we just try to do BufTableLookup() one block after what smgrnblocks() returns?

Unfortunately the problem isn't limited to one block.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2020-10-22 21:45:43 Re: new heapcheck contrib module
Previous Message Mark Dilger 2020-10-22 21:39:10 Re: new heapcheck contrib module