Re: shared tempfile was not removed on statement_timeout

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>
Subject: Re: shared tempfile was not removed on statement_timeout
Date: 2020-07-21 04:32:58
Message-ID: 20200721043258.GE5748@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 13, 2019 at 03:03:47PM +1300, Thomas Munro wrote:
> On Fri, Dec 13, 2019 at 7:05 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > I have a nagios check on ancient tempfiles, intended to catch debris left by
> > crashed processes. But triggered on this file:
> >
> > $ sudo find /var/lib/pgsql/12/data/base/pgsql_tmp -ls
> > 142977 4 drwxr-x--- 3 postgres postgres 4096 Dec 12 11:32 /var/lib/pgsql/12/data/base/pgsql_tmp
> > 169868 4 drwxr-x--- 2 postgres postgres 4096 Dec 7 01:35 /var/lib/pgsql/12/data/base/pgsql_tmp/pgsql_tmp11025.0.sharedfileset
> > 169347 5492 -rw-r----- 1 postgres postgres 5619712 Dec 7 01:35 /var/lib/pgsql/12/data/base/pgsql_tmp/pgsql_tmp11025.0.sharedfileset/0.0
> > 169346 5380 -rw-r----- 1 postgres postgres 5505024 Dec 7 01:35 /var/lib/pgsql/12/data/base/pgsql_tmp/pgsql_tmp11025.0.sharedfileset/1.0
> >
> > I found:
> > 2019-12-07 01:35:56 | 11025 | postgres | canceling statement due to statement timeout | CLUSTER pg_stat_database_snap USI
> > 2019-12-07 01:35:56 | 11025 | postgres | temporary file: path "base/pgsql_tmp/pgsql_tmp11025.0.sharedfileset/2.0", size 5455872 | CLUSTER pg_stat_database_snap USI
>
> Hmm. I played around with this and couldn't reproduce it, but I
> thought of something. What if the statement timeout is reached while
> we're in here:
>
> #0 PathNameDeleteTemporaryDir (dirname=0x7fffffffd010 "base/pgsql_tmp/pgsql_tmp28884.31.sharedfileset") at fd.c:1471
> #1 0x0000000000a32c77 in SharedFileSetDeleteAll (fileset=0x80182e2cc) at sharedfileset.c:177
> #2 0x0000000000a327e1 in SharedFileSetOnDetach (segment=0x80a6e62d8, datum=34385093324) at sharedfileset.c:206
> #3 0x0000000000a365ca in dsm_detach (seg=0x80a6e62d8) at dsm.c:684
> #4 0x000000000061621b in DestroyParallelContext (pcxt=0x80a708f20) at parallel.c:904
> #5 0x00000000005d97b3 in _bt_end_parallel (btleader=0x80fe9b4b0) at nbtsort.c:1473
> #6 0x00000000005d92f0 in btbuild (heap=0x80a7bc4c8, index=0x80a850a50, indexInfo=0x80fec1ab0) at nbtsort.c:340
...

> The CHECK_FOR_INTERRUPTS() inside the walkdir() loop could ereport()
> out of there after deleting some but not all of your files, but the
> code in dsm_detach() has already popped the callback (which it does
> "to avoid infinite error recursion"), so it won't run again on error
> cleanup. Hmm. But then... maybe the two log lines you quoted should
> be the other way around for that.

With inspired from re-reading your messages several times in rapid succession,
I was able to reproduce this easily with:

--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -3344,6 +3344,7 @@ walkdir(const char *path,
struct stat fst;
int sret;

+ usleep(99999);
CHECK_FOR_INTERRUPTS();

On Fri, Dec 13, 2019 at 03:49:26PM +1300, Thomas Munro wrote:
> Ok, so it looks like we shouldn't be relying on the same code path for
> 'happy' and 'error' cleanup. This could probably be fixed with a well
> placed explicit call to SharedFileSetDeleteAll() or a new function
> SharedFileSetDestroy(), and perhaps a flag in shmem to say it's been
> done so the callback doesn't do it again needlessly. I don't think
> this problem is specific to parallel index creation.

Find below a caveman-approved patch which avoids leaving behind tmpfiles.

I'm not sure how to do this cleanly, since:
| src/include/utils/tuplesort.h: * Tuplesortstate and Sharedsort are opaque types whose details are not

Maybe we need to add a new parameter like:
| tuplesort_end(Tuplesortstate *state, bool do_delete_fileset)

Arguably, that has the benefit that existing callers *have* to confront whether
they should delete the fileset or not. This is such a minor issue that it's
unfortunate to force a confrontation over it.

--
Justin

diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index efee86784b..f5b0a48d64 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -511,12 +511,17 @@ _bt_spools_heapscan(Relation heap, Relation index, BTBuildState *buildstate,
return reltuples;
}

+extern void *tuplesort_shared_fileset(Tuplesortstate*);
+
/*
* clean up a spool structure and its substructures.
*/
static void
_bt_spooldestroy(BTSpool *btspool)
{
+ void *fileset = tuplesort_shared_fileset(btspool->sortstate);
+ if (fileset)
+ SharedFileSetDeleteAll(fileset);
tuplesort_end(btspool->sortstate);
pfree(btspool);
}
@@ -1669,6 +1674,10 @@ _bt_end_parallel(BTLeader *btleader)
/* Free last reference to MVCC snapshot, if one was used */
if (IsMVCCSnapshot(btleader->snapshot))
UnregisterSnapshot(btleader->snapshot);
+
+ // SharedFileSetDeleteAll(btleader->sharedsort->fileset);
+ // SharedFileSetDeleteAll(btleader->sharedsort2->fileset);
+
DestroyParallelContext(btleader->pcxt);
ExitParallelMode();
}
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index 5f6420efb2..f89d42f475 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -3344,6 +3344,7 @@ walkdir(const char *path,
struct stat fst;
int sret;

+ usleep(99999);
CHECK_FOR_INTERRUPTS();

if (strcmp(de->d_name, ".") == 0 ||
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 3c49476483..3de9592b78 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -1375,6 +1375,11 @@ tuplesort_free(Tuplesortstate *state)
MemoryContextReset(state->sortcontext);
}

+void *tuplesort_shared_fileset(Tuplesortstate *state)
+{
+ return state->shared ? &state->shared->fileset : NULL;
+}
+
/*
* tuplesort_end
*

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-07-21 04:33:25 Re: expose parallel leader in CSV and log_line_prefix
Previous Message Justin Pryzby 2020-07-21 04:12:31 Re: expose parallel leader in CSV and log_line_prefix