Re: BUG #17543: CSVLOG malformed from disk space error

From: Noah Misch <noah(at)leadboat(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, "nathanfballance(at)gmail(dot)com" <nathanfballance(at)gmail(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17543: CSVLOG malformed from disk space error
Date: 2022-08-21 19:07:28
Message-ID: 20220821190728.GA631665@rfd.leadboat.com
Lists: pgsql-bugs

On Tue, Jul 12, 2022 at 05:48:53PM -0700, Andres Freund wrote:
> On 2022-07-11 05:31:35 -0700, David G. Johnston wrote:
> > On Saturday, July 9, 2022, PG Bug reporting form <noreply(at)postgresql(dot)org> wrote:
> > > Postgresql server with csvlog log_destination enabled will have malformed
> > > CSV upon a disk space error. This causes any loading of the malformed *.csv
> > > log file to error
> > >
> > > ASK: Can the CSV file be written to in a safer way which ensures proper
> > > format even upon disk error?
> >
> > I’d have to say that there is little interest in sacrificing performance
> > for safety here, which seems like an unavoidable proposition.
>
> I agree in general, but this specific issue seems easy enough to address. We
> could just track whether the last write failed, and if so, emit an additional
> newline.
>
> But that just fixes the simple case - if the last successful write contained
> the start of an escaped string, the newline won't necessarily be recognized as
> the end of a "row".

Here's one approach avoiding that problem. After ENOSPC causes the logfile to
end with a prefix of a message, issue ftruncate(logfile, logfile_length -
written_bytes_of_message_prefix).

An alternative would be to periodically posix_fallocate() substantial space in
the logfile, and write messages only to already-allocated space. At rotation,
clean shutdown, or startup, ftruncate() away trailing NUL bytes. I figure
this is inferior to the other approach, because the trailing NUL bytes will be
user-visible after OS crashes and when tailing active logs.
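
For comparison, a sketch of the preallocation variant. Everything here is
illustrative: the PreallocLog struct, the function names, and the 1MB chunk
size are assumptions, not existing code. Only posix_fallocate(), pwrite(),
pread(), and ftruncate() are real calls.

```c
#define _POSIX_C_SOURCE 200809L	/* ask for POSIX declarations */
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define PREALLOC_CHUNK	(1024 * 1024)	/* assumed reservation size */

typedef struct PreallocLog
{
	int			fd;
	off_t		used;			/* bytes of real log data */
	off_t		allocated;		/* bytes reserved on disk */
} PreallocLog;

/* Write one record, but only into space that is already reserved. */
static int
prealloc_log_write(PreallocLog *log, const char *buf, size_t len)
{
	size_t		done = 0;

	while (log->used + (off_t) len > log->allocated)
	{
		/* posix_fallocate() returns an error number; it does not set errno */
		int			rc = posix_fallocate(log->fd, log->allocated, PREALLOC_CHUNK);

		if (rc != 0)
		{
			errno = rc;
			return -1;			/* no space: nothing of this record written */
		}
		log->allocated += PREALLOC_CHUNK;
	}

	while (done < len)
	{
		ssize_t		n = pwrite(log->fd, buf + done, len - done,
							   log->used + (off_t) done);

		if (n < 0)
		{
			if (errno == EINTR)
				continue;
			return -1;
		}
		done += (size_t) n;
	}
	log->used += (off_t) len;
	return 0;
}

/* At rotation or clean shutdown: cut the file back to the real data. */
static int
prealloc_log_finish(PreallocLog *log)
{
	if (ftruncate(log->fd, log->used) != 0)
		return -1;
	log->allocated = log->used;
	return 0;
}

/* At startup after a crash: strip trailing NUL bytes (slow, byte at a time). */
static int
prealloc_log_trim(int fd)
{
	off_t		end = lseek(fd, 0, SEEK_END);
	char		c;

	if (end < 0)
		return -1;
	while (end > 0)
	{
		ssize_t		n = pread(fd, &c, 1, end - 1);

		if (n < 0 && errno == EINTR)
			continue;
		if (n != 1)
			return -1;
		if (c != '\0')
			break;
		end--;
	}
	return ftruncate(fd, end);
}
```

Between prealloc_log_write() and prealloc_log_finish(), the file on disk is up
to one chunk longer than the real data, which is where the user-visible NUL
bytes come from.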

(Neither approach prevents CSV corruption if the OS crashes in the middle of
syslogger's processing of one record. I don't know a low-cost, general fix
for that. One tough case is a field that should have been "foo""bar" getting
truncated to "foo".)
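
To illustrate why that case is tough, here is a small quote-balance check of
the kind a CSV reader effectively performs. The function name is made up for
this example, and it assumes the usual CSV convention that a doubled quote
inside a quoted field is an escaped quote.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical check: does buf[0..len) end outside any quoted field?
 * A reader seeing "true" has no syntactic reason to think the field was
 * cut short.
 */
static bool
csv_ends_outside_quotes(const char *buf, size_t len)
{
	bool		in_quotes = false;

	for (size_t i = 0; i < len; i++)
	{
		if (buf[i] != '"')
			continue;
		if (in_quotes && i + 1 < len && buf[i + 1] == '"')
			i++;				/* escaped quote: stay inside the field */
		else
			in_quotes = !in_quotes;
	}
	return !in_quotes;
}
```

The full field "foo""bar" (10 bytes) passes this check, and so does its 5-byte
prefix "foo". The 6-byte and 7-byte prefixes fail it, so only some truncation
points are detectable from the syntax alone.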
