Re: fd,c just Assert()s that lseek() succeeds

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: fd,c just Assert()s that lseek() succeeds
Date: 2017-02-20 19:02:45
Message-ID: 2982.1487617365@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> I noticed while researching bug #14555 that fd.c contains two separate
> cases like
> vfdP->seekPos = lseek(vfdP->fd, (off_t) 0, SEEK_CUR);
> Assert(vfdP->seekPos != (off_t) -1);
> This seems, um, unwise. It might somehow fail to fail in production
> builds, because elsewhere it's assumed that -1 means FileUnknownPos,
> but it seems like reporting the error would be a better plan.

I looked at things more closely and decided that this was really pretty
seriously broken. Attached is a patch that attempts to spruce things up.

In LruDelete, it's more or less okay to not throw an error, mainly because
that's likely to get called during error abort and so throwing an error
would probably just lead to an infinite loop. But by the same token,
throwing an error from the close() right after that is ill-advised, not to
mention that it would've left the LRU state corrupted since we'd already
unlinked the VFD. I also noticed that really, most of the time, we should
know the current seek position and it shouldn't be necessary to do an
lseek here at all. In the attached, if we don't have a seek position and
an lseek attempt doesn't give us one, we'll close the file but then
subsequent re-open attempts will fail (except in the somewhat-unlikely
case that a FileSeek(SEEK_SET) call comes between and allows us to
re-establish a known seek position).

Meanwhile, having an Assert instead of an honest test in LruInsert is
really dangerous: if that lseek failed, a subsequent read or write would
read or write from the start of the file, not where the caller expected,
leading to data corruption.

I also fixed a number of other places that were being sloppy about
behaving correctly when the seekPos is unknown.

Also, I changed FileSeek to return -1 with EINVAL for the cases where it
detects a bad offset, rather than throwing a hard elog(ERROR). It seemed
pretty inconsistent that some bad-offset cases would get a failure return
while others got elog(ERROR). It was missing an offset validity check for
the SEEK_CUR case on a closed file, too.

I'm inclined to treat this as a bug fix and back-patch it. lseek()
failure on a valid FD is certainly unlikely, but if it did happen our
behavior would be pretty bad. I'm also suspicious that some of these bugs
could be exposed even without an lseek() failure, because of the code in
FileRead and FileWrite that will reset seekPos to "unknown" after an
error.

regards, tom lane

Attachment Content-Type Size
fix-seek-position-sloppiness.patch text/x-diff 12.9 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-02-20 19:04:50 Re: Parallel bitmap heap scan
Previous Message Pavel Stehule 2017-02-20 18:48:18 possible encoding issues with libxml2 functions