BUG #8167: false EINVAL -22 for opening a file

From: nelson(at)monkey(dot)org
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #8167: false EINVAL -22 for opening a file
Date: 2013-05-16 01:17:57
Message-ID: E1UcmpR-0004nn-2i@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

The following bug has been logged on the website:

Bug reference: 8167
Logged by: Nelson Minar
Email address: nelson(at)monkey(dot)org
PostgreSQL version: 9.2.4
Operating system: MacOS 10.8.3
Description:

This report and supporting files are available at
https://gist.github.com/NelsonMinar/5588719

I have a PostGIS insert/select operation that creates rows in a table.
Mostly these operations succeed, but every once in awhile one fails
with an error like
ERROR: could not open file "base/16384/24738_fsm": Invalid argument
Googling for this error shows it's occurred occasionally for other Mac
users, but I haven't seen any consistent explanation or solution.

I traced this down with the help of RhodiumToad and anders on IRC
#postgresql. Their guess was that the problem is that Postgres is
calling open() followed by free() before checking errno for open().
The call to free() subsequently calls madvise(), which throws EINVAL,
clobbering the ENOENT that open() set. When Postgres finally checks if
the open() worked it sees the EINVAL from the madvise() instead and
mistakenly reports a problem opening the file. It's unclear whether
the EINVAL from madvise() is a real error or just part of the normal
operation of free() on MacOS.

RhodiumToad asked me to report "pg in PathNameOpenFile / mdopen is
assuming that errno is preserved over calls to free() which is not
required by the spec" and "madvise is using MADV_FREE_REUSABLE".

I've attached dtruss output to this gist for both a successful query
and a failed query. Here's the essential part of the trace of a
failure:

open("base/16384/24738_fsm\0", 0x2, 0x180) = -1 Err#2
madvise(0x7FB983489000, 0x1000, 0x7) = -1 Err#22
sigprocmask(0x3, 0x7FFF580A0420, 0x0) = 0x0 0
sigreturn(0x0, 0x80000000, 0x0) = 0 0
write(0x2, "ERROR: could not open file \"base/16384/24738_fsm\": Invalid
argument\nSTATEMENT: insert into merged_rivers(gnis_id, name, strahler,
huc8, geometry)\n\t select\n\t
MAX(gnis_id) as gnis_id,\n\t MAX(name) as name", 0x204)
= 516 0
sendto(0xC, 0x7FB983841030, 0x65) = 101 0

Environment:

MacOS 10.8.3 Postgres 9.2.4 installed via Homebrew

I'm not positive if Postgres was built with gcc or clang. Here's the
compiler versions:

i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc.
build 5658) (LLVM build 2336.11.00)

Apple LLVM version 4.2 (clang-425.0.28) (based on LLVM 3.2svn) Target:
x86_64-apple-darwin12.3.0 Thread model: posix

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message weiwei 2013-05-16 09:14:23 Postgresql 9.12 support cluster mode
Previous Message Sean Chittenden 2013-05-15 20:54:52 WAL partition ran out of space, pg_subtrans & pg_clog partially written...

Browse pgsql-hackers by date

  From Date Subject
Next Message Jon Nelson 2013-05-16 02:14:33 Re: fallocate / posix_fallocate for new WAL file creation (etc...)
Previous Message Andres Freund 2013-05-16 01:15:10 Re: Heap truncation without AccessExclusiveLock (9.4)