Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Anthony Iliopoulos <ailiop(at)altatus(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Catalin Iacob <iacobcatalin(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date: 2018-04-08 03:37:06
Message-ID: CAH2-Wz=vJ4KoRfa=sdYkqrrzPTp1D7yYa9CJXTGn-DAg4zVrpw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Apr 7, 2018 at 8:27 PM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> More below, but here's an idea #5: decide InnoDB has the right idea, and go
> to using a single massive blob file, or a few giant blobs.
>
> We have a storage abstraction that makes this way, way less painful than it
> should be.
>
> We can virtualize relfilenodes into storage extents in relatively few big
> files. We could use sparse regions to make the addressing more convenient,
> but that makes copying and backup painful, so I'd rather not.
>
> Even one file per tablespace for persistent relation heaps, another for
> indexes, another for each fork type.

I'm not sure that we can do that now, since it would break the new
"Optimize btree insertions for common case of increasing values"
optimization. (I did mention this before it went in.)

I've asked Pavan to at least add a note to the nbtree README that
explains the high level theory behind the optimization, as part of
post-commit clean-up. I'll ask him to say something about how it might
affect extent-based storage, too.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2018-04-08 03:42:23 Re: pgsql: Support partition pruning at execution time
Previous Message Andrew Gierth 2018-04-08 03:34:22 Re: pgsql: Support partition pruning at execution time