Re: Long paths for tablespace leads to uninterruptible hang in Windows

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Long paths for tablespace leads to uninterruptible hang in Windows
Date: 2013-10-14 15:10:24
Message-ID: CABUevExg2w3P7Khk3bb7A5wH33idtyPXA4MwPtPNVwYMu6x=9Q@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 14, 2013 at 2:28 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Oct 10, 2013 at 9:34 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> On further analysis, I found that the hang occurs in some Windows
>> APIs (FindFirstFile, RemoveDirectory) when the symlink path
>> (pg_tblspc/spcoid/TABLESPACE_VERSION_DIRECTORY) is passed to these
>> APIs. For the above test case, it hangs in the path
>> destroy_tablespace_directories->ReadDir->readdir->FindFirstFile
>
> Well, that sucks. So it's a Windows bug.
>
>> Some ways to resolve the problem are described below:
>>
>> 1. I found that if the link path is accessed as a full path during
>> readdir or stat, it works fine.
>>
>> For example, in destroy_tablespace_directories(), the path used to
>> access the tablespace directory has the form
>> "pg_tblspc/16235/PG_9.4_201309051", built by the sprintf below:
>> sprintf(linkloc_with_version_dir,
>> "pg_tblspc/%u/%s",tablespaceoid,TABLESPACE_VERSION_DIRECTORY);
>> The code assumes that the corresponding OS API will resolve this
>> path relative to the current working directory, which is right as
>> per the specs.
>> However, the OS API (FindFirstFile) hangs if the path length is > 130
>> for a symlink, whereas using the full path instead of one starting
>> with pg_tblspc works fine.
>> So one way to resolve this issue is to use the full path for symbolic
>> link access instead of relying on the OS to expand the relative path.
>
> I'm not sure how we'd implement this, except by doing #2.
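
For reference, option 1 quoted above amounts to building the link path from the data directory rather than relative to the current working directory. A minimal sketch follows, assuming the data directory is available as a string. The helper name, the EXAMPLE_VERSION_DIR macro and the datadir parameter are hypothetical and not taken from the PostgreSQL source; the version string is the one from the example path in the report. Whether this actually avoids the hang rests only on the observation quoted above.

```c
#include <stdio.h>

/* Version directory name copied from the example path in the report. */
#define EXAMPLE_VERSION_DIR "PG_9.4_201309051"

/*
 * Hypothetical helper: write "<datadir>/pg_tblspc/<oid>/<version dir>"
 * into dst. Returns 0 on success, -1 if the result did not fit in dst
 * or formatting failed.
 */
static int
build_full_tblspc_path(char *dst, size_t dstlen,
                       const char *datadir, unsigned int spcoid)
{
    int n = snprintf(dst, dstlen, "%s/pg_tblspc/%u/%s",
                     datadir, spcoid, EXAMPLE_VERSION_DIR);

    /* snprintf reports the length it wanted; treat truncation as failure. */
    return (n < 0 || (size_t) n >= dstlen) ? -1 : 0;
}
```

A caller would pass a buffer sized for the platform's maximum path length and treat -1 as an error, so a truncated path is never handed to readdir or stat.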

If we believe it's a Windows bug, perhaps a good start would be to
report it to Microsoft? There might be an "official workaround" for
it, or there might in fact already be a fix for it.

We're *probably* going to have to end up deploying a workaround, but
it would be a good idea to check first if they have a suggestion for
how...

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
