Error handling (or lack of it) in RemovePgTempFilesInDir

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Error handling (or lack of it) in RemovePgTempFilesInDir
Date: 2017-12-04 15:44:14
Message-ID: 19907.1512402254@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The recent changes in commit dc6c4c9dc caused Coverity to start
complaining that RemovePgTempFilesInDir calls rmdir() without checking
its return value, as would be good practice. Now, this wasn't really a
fault of that commit, because the code was already ignoring the result of
unlink(), but it did cause me to wonder why we're ignoring either one.
Even granted that we don't want to throw ERROR there (because this code
runs in the postmaster, and an ERROR would force postmaster exit), our
usual coding practice would be to print a LOG message about the failure.

Tracing back, the decision to ignore errors altogether seems to originate
with commit 2a6f7ac45, which added this code:

if (strncmp(temp_de->d_name,
PG_TEMP_FILE_PREFIX,
strlen(PG_TEMP_FILE_PREFIX)) == 0)
{
unlink(rm_path);
}
else
{
/*
* would prefer to use elog here, but it's not
* up and running during postmaster startup...
*/
fprintf(stderr,
"Unexpected file found in temporary-files directory: %s\n",
rm_path);
}

I'm not real sure that the bit about "elog not being up" was true even
at the time, and it's definitely wrong now; but perhaps thinking that
it wasn't up explains why we chose not to whine about unlink failure.
Or maybe there was no thought that it could fail at all --- it's surely
odd to complain about "this file shouldn't be here" but not about
"I should be able to remove this file but failed to".

Anyway, I'm inclined to reverse that choice and emit LOG messages
reporting failure of any of the lstat, rmdir, or unlink calls in
RemovePgTempFilesInDir. In the worst case, say that there are a
bunch of leftover temp files in a directory that someone has made
unwritable, this might cause a fair amount of whining in the postmaster
log --- but that's a situation that needs to be fixed anyway, so
I cannot see how not printing anything is a good idea.

I'm also inclined to convert the ReadDir calls in this function to be
ReadDirExtended(..., LOG), as per Michael's proposal in
https://www.postgresql.org/message-id/CAB7nPqRpOCxjiirHmebEFhXVTK7V5Jvw4bz82p7Oimtsm3TyZA@mail.gmail.com
If we don't want to throw a hard error for failure to remove files,
it seems like throwing an error for failure to read a temp dir isn't
a great choice either.

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Konstantin Knizhnik 2017-12-04 15:46:07 Re: [HACKERS] Cached plans and statement generalization
Previous Message Pavel Stehule 2017-12-04 15:42:41 Re: [HACKERS] proposal: psql command \graw