Re: Patch to remove sort files, temp tables, unreferenced files

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Patch to remove sort files, temp tables, unreferenced files
Date: 2001-05-30 03:46:49
Message-ID: 200105300346.f4U3knr08324@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > We can't just throw up our hands and say we can't account for this
> > left-over stuff. We have to come up with some solution.
>
> No we don't. I'm not convinced that we have a problem at all anymore.
> (Show me field reports from 7.1; older versions are irrelevant to this
> discussion.) But I am convinced that a half-baked solution would be
> worse than not doing anything.

We already know that crashes will leave these files around, and as we
start using file versioning for CLUSTER and DROP COLUMN, I expect those
orphaned files to continue.

Basically, if I can, I would like to harden our code to report or remove
this cruft. Most commercial database have such tools. Seems VACUUM is
the proper place for us to do it.

However, I agree if I can't make it 100% reliable, it is worse than
useless.

Let me keep trying and we can rip out parts that aren't 100%.

> >> 1. Removing sorttemp files during vacuum is not safe; you cannot know
> >> that they don't belong to any running backend. (No, the fact that you
> >> looked in PROC and didn't find the PID doesn't prove anything; the same
> >> PID could have been reassigned to a new backend since you looked.)
>
> > But I have the directory entry before I look for the PID. Seems safe.
>
> No. PIDs wrap around, therefore sorttemp file names recycle. You might
> be able to establish that the backend that originally created the file
> is gone, but it's still possible that by the time you actually do the
> unlink, you are cutting the knees off a new backend that has re-used the
> file name.
>
> Basically, you cannot do any of this safely without a lock. If you
> think you can, then you don't understand the problem.

You are correct. I didn't realized that after I check the shared memory
for the pid, another backend could start with the pid I was checking and
create a sort file. The sort code doesn't so O_EXCL, so it would
succeed in creating it, and I would have removed it.

> I don't think there are any suitable locks in the system at the moment,
> and in any case I don't like the loss of concurrency that would occur
> if we interlocked process start/stop, sorttemp file
> creation/destruction, etc, to the extent that would be needed to make
> the world safe for on-line temp file removal. This cure looks much
> worse than the disease to me...

OK, my solution was to do this:

else if (ShmemPIDLookup(pid, true) == NULL)
{
snprintf(temp_path, MAXPGPATH,
"%s/%s/%s", cwd, SORT_TEMP_DIR, temp_de->d_name);
/* Make sure no pid gets created and starts using this file */
SpinAcquire(ShmemIndexLock);
if (ShmemPIDLookup(pid, false) == NULL)
unlink(temp_path);
SpinRelease(ShmemIndexLock);
}

I added a boolean flag to ShmemPIDLookup() to control whether the
function does locking. Basically, I do my own locking around the unlink
if I have found a orphaned file. This prevents another backend from
being created under me with the pid I am checking. This code only runs
if I have already found an orphan.

>
> >> BTW, while I approve of moving sorttemp files into their own
> >> subdirectory, it's sheer folly to give them names therein that look
> >> so much like regular relation file names. They should still be named
> >> something like "sorttempNNN".
>
> > They are in their own directory. Naming as backend_pid.serial number seems
> > OK to me.
>
> The code won't get confused, but humans might. Keep in mind also that
> the reason for having a separate directory is so that a DBA can symlink
> the directory to someplace else. What happens if he symlinks it to a
> directory that also contains real data files? Far better to make sure
> that temp file names cannot conflict with relation file names, whether
> they're in the same directory or not.

See attached patch. pid_###.###.

Also, I re-added the startOid/nextoid code to prevent me from looking at
any pg_class changes caused by running backends.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

Attachment Content-Type Size
unknown_filename text/plain 26.1 KB

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Christopher Kings-Lynne 2001-05-30 04:17:15 DROP CONSTRAINT docs patch
Previous Message Bruce Momjian 2001-05-30 00:51:43 Re: Fix for alter table add constraint inheritance