Permission denied on fsync / Win32 (was right sibling is not next child)

From: "Peter Brant" <Peter(dot)Brant(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-bugs(at)postgresql(dot)org>
Subject: Permission denied on fsync / Win32 (was right sibling is not next child)
Date: 2006-04-13 15:49:52
Message-ID: 443E2CD0020000BE00002D43@gwmta.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

It turns out we've been getting rather huge numbers of "Permission
denied" errors relating to fsync so perhaps it wasn't really a precursor
to the crash as I'd previously thought.

I've pasted in a complete list following this email covering the time
span from 3/20 to 4/6. The number in the first column is the number of
times the given log message appeared.

The interesting thing is that _none_ of the referenced relfilenode
numbers actually appear in the file system.

In a possibly related note, I've confirmed there is a race condition on
Windows when the temporary stats file is renamed to the working one.
There is apparently a window where a backend opening the stats file will
find it missing. I'll send more info later. I haven't had a chance to
come back to it yet.

Regarding your other questions:
- The file system is NTFS
- Regarding the initial crash, looking more closely, I don't think it
was a crash at all (or at most it was a human-induced "crash"). In the
log everything looks normal (assuming the Permission denied errors are
"normal"), and then

[2006-03-31 14:26:30.705 ] 2328 LOG: received fast shutdown request
[2006-03-31 14:35:33.173 ] 4016 <viewer bigbird 127.0.0.1(3413)> FATAL:
the database system is shutting down
[2006-03-31 14:35:33.189 ] 6504 <viewer bigbird 127.0.0.1(3414)> FATAL:
the database system is shutting down
....
[2006-03-31 14:39:53.595 ] 7576 <viewer bigbird 165.219.80.39(36220)>
FATAL: the database system is shutting down

and in the next log file

[2006-03-31 14:31:05.298 ] 608 LOG: database system was interrupted at
2006-03-31 13:20:06 Central Standard Time
[2006-03-31 14:31:05.314 ] 608 LOG: checkpoint record is at
EF/B41D7580
[2006-03-31 14:31:05.314 ] 608 LOG: redo record is at EF/B41A0C08;
undo record is at 0/0; shutdown FALSE
[2006-03-31 14:31:05.314 ] 608 LOG: next transaction ID: 295492806;
next OID: 1395901
[2006-03-31 14:31:05.314 ] 608 LOG: next MultiXactId: 1; next
MultiXactOffset: 0
[2006-03-31 14:31:05.314 ] 608 LOG: database system was not properly
shut down; automatic recovery in progress

Pete

747 LOG: could not fsync segment 0 of relation 1663/16385/1361661:
Permission denied
414 LOG: could not fsync segment 0 of relation 1663/16385/1363194:
Permission denied
2 LOG: could not fsync segment 0 of relation 1663/16385/1363196:
Permission denied
441 LOG: could not fsync segment 0 of relation 1663/16385/1369401:
Permission denied
4520 LOG: could not fsync segment 0 of relation 1663/16385/1373027:
Permission denied
1024 LOG: could not fsync segment 0 of relation 1663/16385/1374375:
Permission denied
2683 LOG: could not fsync segment 0 of relation 1663/16385/1375726:
Permission denied
775 LOG: could not fsync segment 0 of relation 1663/16385/1375733:
Permission denied
83 LOG: could not fsync segment 0 of relation 1663/16385/1377367:
Permission denied
64 LOG: could not fsync segment 0 of relation 1663/16385/1377685:
Permission denied
3334 LOG: could not fsync segment 0 of relation 1663/16385/1379641:
Permission denied
16 LOG: could not fsync segment 0 of relation 1663/16385/1381290:
Permission denied
819 LOG: could not fsync segment 0 of relation 1663/16385/1383833:
Permission denied
347 LOG: could not fsync segment 0 of relation 1663/16385/1386037:
Permission denied
1 LOG: could not fsync segment 0 of relation 1663/16385/1388257:
Permission denied
135 LOG: could not fsync segment 0 of relation 1663/16385/1388264:
Permission denied
6397 LOG: could not fsync segment 0 of relation 1663/16385/1389813:
Permission denied
3719 LOG: could not fsync segment 0 of relation 1663/16385/1391589:
Permission denied
763 LOG: could not fsync segment 0 of relation 1663/16385/1391610:
Permission denied
10784 LOG: could not fsync segment 0 of relation 1663/16385/1392439:
Permission denied
5846 LOG: could not fsync segment 0 of relation 1663/16385/1392444:
Permission denied
1 LOG: could not fsync segment 0 of relation 1663/16385/2282587:
Permission denied
243 LOG: could not fsync segment 0 of relation 1663/16385/2282602:
Permission denied
526 LOG: could not fsync segment 0 of relation 1663/16385/2293690:
Permission denied
1754 LOG: could not fsync segment 0 of relation 1663/16385/2293695:
Permission denied
894 LOG: could not fsync segment 0 of relation 1663/16385/2304886:
Permission denied
692 LOG: could not fsync segment 0 of relation 1663/16385/2304891:
Permission denied
1556 LOG: could not fsync segment 0 of relation 1663/16385/2315303:
Permission denied

>>> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> 04/13/06 2:30 am >>>
crash at 14:31? What was the immediate cause of the crash (I'm
expecting a PANIC or possibly an Assert triggered it)?

BTW, what sort of filesystem is the database sitting on, on this
machine?

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Stephan Szabo 2006-04-13 16:01:57 Re: BUG #2390: check constraint
Previous Message Tom Lane 2006-04-13 01:30:59 Re: right sibling is not next child