Re: [HACKERS] Tablespaces

From: jearl(at)bullysports(dot)com
To: "Thomas Swan" <tswan(at)idigx(dot)com>
Cc: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Zeugswetter Andreas SB SD" <zeugswettera(at)spardat(dot)at>, "Greg Stark" <gsstark(at)mit(dot)edu>, <pgsql-hackers(at)postgresql(dot)org>, "PostgreSQL Win32 port list" <pgsql-hackers-win32(at)postgresql(dot)org>
Subject: Re: [HACKERS] Tablespaces
Date: 2004-03-05 18:54:00
Message-ID: 3c8np1kn.fsf@bullysports.com
Lists: pgsql-hackers pgsql-hackers-win32

"Thomas Swan" <tswan(at)idigx(dot)com> writes:

> jearl(at)bullysports(dot)com wrote:

[snip]

> Apparently, I have failed tremendously in addressing a concern. The
> question is does PostgreSQL need to rely on symlinks and will that
> dependency introduce problems?
>
> There is an active win32 port underway (see this mailing list).  
> One proposal was to try to use an OS specific filesystem feature to
> perform a symlink on NTFS.  Can the special symlink that NTFS
> allegedly supports be archived the same way symlinks are archived on
> Unix?  If so, is there a utility that can do this (zip, tar, etc). 
> The backup operator would still need to know what directories needed
> to be archived in addition to the pgdata directory.    Is this
> symlink structure a normal/special file that can be archived by
> normal means (tar,zip, etc)?
>
> Example:
>
> PGDATA is C:\pgdata
> I have a tablespace in Z:\1\ and Z:\2\
> There exists an alleged symlink in
> C:\pgdata\data\base\tablespaces\schmoo -> Z:\1
>
> Can I archive [ C:\pgdata, Z:\1, Z:\2 ], restore them, and have
> postgresql working just as before?

My point is that if you are using some sort of system table, catalog
file, or whatever else (besides symlinks), you are still going to have
to back up c:\pgdata, z:\1, and z:\2 manually anyhow. Unless, of
course, you have some magical version of tar or zip that reads the
PostgreSQL specific configuration file and somehow "does the right
thing." At least with symlinks you have a fighting chance on some
platforms of being able to do:

tar zcvf /tmp/backup.tgz /usr/local/postgres/pgdata/

and have it do the right thing. Using PostgreSQL specific catalogs
you would force UNIX users (the majority of PostgreSQL users right
now) to back up their tablespaces manually. Forcing platforms with
symlinks to use your wacky symlink replacement just guarantees that
all platforms work equally poorly. It doesn't make the Win32 port any
better. You would still have to back up c:\pgdata, z:\1\, and z:\2\
separately on Win32. The only difference is that now your misery
would have the company of all of the rest of us.
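
A minimal sketch of the symlink case, using Python's standard tarfile
module rather than the tar command itself. The layout (a "tablespaces"
directory under pgdata, a link named "schmoo", a data file named
"16384") is hypothetical. Note that GNU tar stores a symlink as a link
by default and only archives the target's contents when given -h
(--dereference); the dereference flag below plays the same role.

```python
import os
import tarfile
import tempfile


def archive_names(source_dir, archive_path, follow_symlinks):
    """Archive source_dir and return {member name: member type} for inspection."""
    # dereference=True stores what a symlink points at instead of the link itself.
    with tarfile.open(archive_path, "w:gz", dereference=follow_symlinks) as tar:
        tar.add(source_dir, arcname="pgdata")
    with tarfile.open(archive_path, "r:gz") as tar:
        return {
            m.name: ("symlink" if m.issym() else "dir" if m.isdir() else "file")
            for m in tar.getmembers()
        }


def build_demo_layout(root):
    """Create a hypothetical layout: pgdata/tablespaces/schmoo -> <root>/vol1."""
    pgdata = os.path.join(root, "pgdata")
    vol1 = os.path.join(root, "vol1")
    os.makedirs(os.path.join(pgdata, "tablespaces"))
    os.makedirs(vol1)
    with open(os.path.join(vol1, "16384"), "w") as f:
        f.write("table data")
    # Creating symlinks may require extra privileges on Windows.
    os.symlink(vol1, os.path.join(pgdata, "tablespaces", "schmoo"))
    return pgdata


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        pgdata = build_demo_layout(root)
        for follow in (False, True):
            names = archive_names(pgdata, os.path.join(root, "backup.tgz"), follow)
            print("follow symlinks:", follow)
            for name, kind in sorted(names.items()):
                print("   ", kind, name)
```

Without dereferencing, the archive holds only the link; with it, the
tablespace files travel inside the single archive.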

>>>It seems a little insane to introduce an OS/filesystem dependency
>>>at the onset of a porting effort especially if you hope to be OS
>>>agnostic for feature sets.  I think someone would be crying foul if
>>>a new feature only worked on Linux and not on FreeBSD.    
>>>
>>
>>First of all, symlinks are a pretty popular "feature."  Even Windows
>>supports what would be needed.  Second of all, PostgreSQL will still
>>run on OSes without symlinks, tablespaces won't be available, but
>>PostgreSQL will still run.  Since we are all using PostgreSQL
>>without tablespaces now, it can hardly be argued that tablespaces
>>are a critical feature.
>>
>>We aren't talking about a "feature that work[s] on Linux and not on
>>FreeBSD."  We are talking about a feature that works on every OS
>>that supports symlinks (which includes even operating systems like
>>Windows that PostgreSQL doesn't currently support).
>
> Hello?  What was this response from Tom Lane? "My feeling is that we
> need not support tablespaces on OS's without symlinks."  That seems to
> be indicative of a feature set restriction based on platform.

Tom Lane works for Red Hat. You can hardly expect him to spend all
of his time working around the limitations of the competition's
operating system.

>>>Additionally, another developer noted the advantage of a text file
>>>is that it would be easy for someone to develop tools to help if it
>>>became difficult to edit or parse.  Additionally, there could be a
>>>change away from a flat file format to an XML format to configure
>>>the tablespace area.
>>
>>The advantage of symlinks is that no tools would have to be written
>>and 'ls -l' would show everything you would need to know about where
>>your tablespaces actually were.
>
> Where is 'ls -l' on a win32 box?  If you will follow the discussion
> of symlinks under MinGW you will see that they don't work as
> commanded. And, postgresql is supposed to be compiled under MinGW,
> but not require it to run.
>
> From Windows 2000, 'ls' is not recognized as an internal or external
> command, operable program or batch file.

Yes, Windows lacks 'ls', but it has similar tools.
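
As one illustration of a portable stand-in for 'ls -l', here is a
small sketch that reports where each symbolic link in a directory
points. The function name is made up for this example. It covers
symbolic links only; NTFS junctions are a different kind of reparse
point and may need separate handling.

```python
import os
import tempfile


def list_link_targets(directory):
    """Return {entry name: link target} for every symbolic link in directory."""
    targets = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            # Regular files and directories are skipped; only links are reported.
            if entry.is_symlink():
                targets[entry.name] = os.readlink(entry.path)
    return targets


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        tablespaces = os.path.join(root, "tablespaces")  # hypothetical layout
        volume = os.path.join(root, "vol1")
        os.makedirs(tablespaces)
        os.makedirs(volume)
        os.symlink(volume, os.path.join(tablespaces, "schmoo"))
        for name, target in list_link_targets(tablespaces).items():
            print(name, "->", target)
```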

>>XML files are relatively easy to parse, but they certainly aren't as
>>easy as simply letting PostgreSQL follow a symlink.  Why reinvent the
>>wheel with what would essentially be PostgreSQL's own implementation
>>of a symlink?
>> 
>>
>
> Is opening a file recreating a symlink?  If you are opening file
> descriptors, why rely on symlinks?  If you know the location, either
> from the system catalog or a configuration file, is it terribly
> more complicated?   Basically, if a tablespace needed to be renamed,
> or moved, or changed, you're going to have to do file management
> anyway.  The symlink saves you just a lookup as to what files go
> where?  If you kept this small hash in memory, it's not a continuous
> lookup because you have the redirection internally.  And, it's more
> portable.

Yes, and the implementation of tablespaces apparently keeps track of
all of that in the system catalogs. In other words, the new
tablespace mechanism will be fancier than what we currently have (which is
just the ability to move the files themselves somewhere else and
create symlinks).

What the symlink does is allow the core part of PostgreSQL to pretend
that the files are still in one directory just like they have always
been. It also allows folks using OSes that support symlinks to be
able to simply tar up the directory :).
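
To make the contrast concrete, here is a rough sketch of the two ways
of locating a table file. Both function names, the "tablespaces"
subdirectory, and the file name "16384" are hypothetical and are not
taken from the PostgreSQL source. With a symlink, the caller builds one
path under pgdata and the operating system does the redirection; with
the in-memory lookup proposed in the quoted text, the application does
it.

```python
import os
import tempfile


def path_via_symlink(pgdata, tablespace, relfile):
    """Build a path under pgdata; the OS follows the tablespace symlink."""
    return os.path.join(pgdata, "tablespaces", tablespace, relfile)


def path_via_lookup(locations, tablespace, relfile):
    """Build a path from an application-maintained name -> directory map."""
    return os.path.join(locations[tablespace], relfile)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        pgdata = os.path.join(root, "pgdata")
        volume = os.path.join(root, "vol1")
        os.makedirs(os.path.join(pgdata, "tablespaces"))
        os.makedirs(volume)
        with open(os.path.join(volume, "16384"), "w") as f:
            f.write("table data")
        os.symlink(volume, os.path.join(pgdata, "tablespaces", "schmoo"))

        a = path_via_symlink(pgdata, "schmoo", "16384")
        b = path_via_lookup({"schmoo": volume}, "schmoo", "16384")
        # Different path strings, same underlying file.
        print(a != b, os.path.samefile(a, b))
```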

>>To go back to your *tar* example, are we going to rewrite tar so that
>>it reads PostgreSQL's XML file and *does the right thing*?
>
> I'm not talking about integrating tar with PostgreSQL or uniting the
> universal string theory with how to cook apple pie.  Why would you think
> we would have to rewrite tar?

Your tar example was the only argument that you made against symlinks
that made any sense at all. Here's an example that you wanted to see
work.

Example:

PGDATA is C:\pgdata
I have a tablespace in Z:\1\ and Z:\2\
There exists an alleged symlink in
C:\pgdata\data\base\tablespaces\schmoo -> Z:\1

Can I archive [ C:\pgdata, Z:\1, Z:\2 ], restore them, and have
postgresql working just as before?

Now, using some sort of XML catalog can you archive [ C:\pgdata, Z:\1,
Z:\2 ] and have PostgreSQL working just as before?

The answer to that question is *no*, unless of course you read the XML
file yourself, archive each tablespace manually, and then restore it
all piece by piece. After all, zip or tar doesn't know anything about
your XML file, and so it doesn't have a clue about your various
tablespaces outside of c:\pgdata. Since this is *precisely* what you
would have to do with the symlink implementation (the difference being
that you would have to read the system catalogs instead of an XML
file) what's the point in handicapping systems with symlinks that work?
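
For what it's worth, the manual step is small either way. The sketch
below assumes a made-up flat-file format of "name = path" lines (no
such file exists in PostgreSQL; an XML file or a system catalog query
would take its place) and derives the list of directories a backup
operator would have to archive by hand.

```python
def parse_tablespace_file(text):
    """Parse hypothetical 'name = path' lines; '#' starts a comment."""
    locations = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, path = line.partition("=")
        if not sep:
            raise ValueError(f"malformed line: {raw!r}")
        locations[name.strip()] = path.strip()
    return locations


def backup_set(pgdata, locations):
    """Directories to archive: pgdata first, then each distinct tablespace."""
    dirs = [pgdata]
    for path in locations.values():
        if path not in dirs:
            dirs.append(path)
    return dirs


if __name__ == "__main__":
    # Raw string so that the backslashes in the Windows paths stay literal.
    config = r"""
    # hypothetical tablespace file
    schmoo = Z:\1
    other  = Z:\2
    """
    for directory in backup_set(r"C:\pgdata", parse_tablespace_file(config)):
        print(directory)
```

Neither zip nor tar will run this step for you, which is the point
being made above.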

[snip]

> The other option proposed was to give win32 a subset of features that
> would be available to other platforms.  In this case, that would be that
> the win32 port could not support tablespaces.  This is strikingly different
> than a performance issue.   It would be one thing for tablespaces to
> perform poorly, it's another for them to fail or not exist altogether.
>
>>Perhaps if you could give us an example of an actual case where some
>>actual PostgreSQL users (or potential users) might be affected?
>
> See the comment from Tom Lane on limiting features.  Look at the
> potential Win32 market which outnumbers the unix market in number of
> computers and developers by a large margin.

I *agree* with Tom Lane. Personally I am more concerned about
PostgreSQL running well on Linux (the platform I use) than I am about
being able to possibly win potential Windows installations. As far as
I am concerned my use of PostgreSQL on Linux gives me a competitive
advantage :). I see no reason to dumb down the Linux version of
PostgreSQL simply so that I can share the misery that a Windows user
has to face on a daily basis.

Besides which, directory symlinks actually do exist on Windows. I
have spent a bit of time playing with the Sysinternals 'junction' program,
and while junctions are not quite as cool as symlinks, they would certainly work.

Jason
