From: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>
To: Aleksander Alekseev <aleksander(at)tigerdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Michael Banck <mbanck(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Frits Hoogland <frits(dot)hoogland(at)gmail(dot)com>
Subject: Re: The ability of postgres to determine loss of files of the main fork
Date: 2025-10-01 12:05:53
Message-ID: CAKZiRmy0CK3m0-raCdTDELg0JjY7qAqzEN9P5n4N4wGw6ys4tw@mail.gmail.com
Lists: pgsql-hackers
On Wed, Oct 1, 2025 at 1:46 PM Aleksander Alekseev
<aleksander(at)tigerdata(dot)com> wrote:
>
> Hi Jakub,
>
> > IMHO all files should be opened at least on startup to check integrity,
>
> That might be a lot of files to open.
I was afraid of that, but say a modern high-end database is 200 TB; that's
roughly 200*1024 = ~205k 1 GB segment files. For 204k files on ext4 I'm
getting the following time(1) timings:
$ time ./createfiles                        # real 0m2.157s, open(O_CREAT)+close()
$ time ls -l many_files_dir/ > /dev/null    # real 0m0.734s
$ time ./openfiles                          # real 0m0.297s, already existing files (hot)
$ time ./openfiles                          # real 0m1.456s, already existing files (cold, after echo 3 > drop_caches)
Not bad in my book for a one-time activity. It could potentially pose a
problem with high-latency open() calls, e.g. NFS or other remote storage.
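
For reference, the openfiles test is essentially just the loop below (a
minimal sketch, not the exact program I ran; directory name and error
handling are simplified, and createfiles is the same idea with
open(O_CREAT|O_WRONLY) on generated names):

/*
 * Minimal sketch of an "openfiles" test: open(O_RDONLY) + close() every
 * file in a directory, roughly what a startup scan over relation segment
 * files would have to do.
 */
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
    const char *dirname = (argc > 1) ? argv[1] : "many_files_dir";
    DIR        *dir = opendir(dirname);
    struct dirent *de;
    char        path[4096];
    long        nopened = 0;

    if (dir == NULL)
    {
        fprintf(stderr, "opendir(\"%s\"): %s\n", dirname, strerror(errno));
        return 1;
    }

    while ((de = readdir(dir)) != NULL)
    {
        int         fd;

        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;

        snprintf(path, sizeof(path), "%s/%s", dirname, de->d_name);
        fd = open(path, O_RDONLY);
        if (fd < 0)
        {
            fprintf(stderr, "open(\"%s\"): %s\n", path, strerror(errno));
            return 1;
        }
        close(fd);
        nopened++;
    }

    closedir(dir);
    printf("opened and closed %ld files\n", nopened);
    return 0;
}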
> Even if you can open a file it doesn't mean it's not empty
Correct, I haven't gone down that rabbit hole...
> or is not corrupted.
I think checksums guard users well in this case, as they would get
notified that something is wrong (much better than a wrong result or
silent data loss).
-J.