Re: Race in "tablespace" test on Windows

From: Noah Misch <noah(at)leadboat(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Race in "tablespace" test on Windows
Date: 2014-11-13 03:16:19
Message-ID: 20141113031619.GB781371@tornado.leadboat.com
Lists: pgsql-hackers

On Tue, Nov 11, 2014 at 10:21:26AM +0530, Amit Kapila wrote:
> On Sat, Nov 8, 2014 at 10:34 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> > Here is a briefer command sequence exhibiting the same problem:
> >
> > CREATE TABLESPACE testspace LOCATION '...somewhere...';
> > CREATE TABLE atable (c int) tablespace testspace;
> > SELECT COUNT(*) FROM atable; -- open heap
> > \c -
> > ALTER TABLE atable SET TABLESPACE pg_default;
> > DROP TABLESPACE testspace; -- bug: fails sometimes
> > DROP TABLESPACE testspace; -- second one ~always works
> > DROP TABLE atable;
> >
>
> For me, it doesn't succeed even the second time. I keep getting
> the same error until I execute some command in the first session,
> that is, until the first session has processed the invalidation
> messages.
>
> postgres=# Drop tablespace tbs;
> ERROR: tablespace "tbs" is not empty
> postgres=# Drop tablespace tbs;
> ERROR: tablespace "tbs" is not empty
>
> I have tested this on Windows 7.

The behavior you see makes sense if you have a third, idle backend. I had
only the initial backend and the "\c"-created second one.

> > To make this work as well on Windows as it does elsewhere, DROP TABLESPACE
> > would need to wait for other backends to close relevant unlinked files.
> > Perhaps implement "wait_unlinked_files(const char *dirname)" to poll
> > unlinked, open files until they disappear. (An attempt to open an unlinked
> > file reports ERROR_ACCESS_DENIED. It might be tricky to reliably
> > distinguish this cause from other causes of that error, but it should be
> > possible.)
>
> I think the proposed mechanism can work, but the wait can be very long
> (until the backend holding the descriptor executes another command).

The DROP TABLESPACE could send a catchup interrupt.

> Can we think of some other solution? For example, in DROP TABLESPACE,
> instead of checking whether the directory is empty, check that no object
> belonging to the database/cluster remains, then forcibly delete that
> directory in some way.

I'm not aware of a way to forcibly delete the directory. One could rename
files to the tablespace top-level directory just before unlinking them. Since
DROP TABLESPACE never removes that directory, their continued presence there
would not pose a problem. (Compare use of the rename-before-unlink trick in
RemoveOldXlogFiles().) That adds the overhead of an additional system call to
every unlink, which might be acceptable. It may be possible to rename after
unlink, as-needed in DROP TABLESPACE.

> > I propose to add
> > this as a TODO, then bandage the test case with s/^\\c -$/RESET ROLE;/.
>
> Yeah, this makes sense.

Done.
