Re: Storing many big files in database- should I do it?

From: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To: adrian(dot)klaver(at)gmail(dot)com
Cc: pgsql-general(at)postgresql(dot)org, Anthony <osm(at)inbox(dot)org>, Rod <cckramer(at)gmail(dot)com>
Subject: Re: Storing many big files in database- should I do it?
Date: 2010-04-29 09:10:48
Message-ID: o2ie94e14cd1004290210y354cbf68s43422977260cecea@mail.gmail.com
Lists: pgsql-general

2010/4/28 Adrian Klaver <adrian(dot)klaver(at)gmail(dot)com>:
> On Tuesday 27 April 2010 5:45:43 pm Anthony wrote:
>> On Tue, Apr 27, 2010 at 5:17 AM, Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com> wrote:
>> > store your files in a filesystem, and keep the path to the file (plus
>> > metadata, acl, etc...) in database.
>>
>> What type of filesystem is good for this?  A filesystem with support for
>> storing tens of thousands of files in a single directory, or should one
>> play the 41/56/34/41563489.ext game?

I'd go with XFS or ext3/ext4, and in both cases with a path game.
Your path game is what lets you handle the scalability of your uploads
(so the first increment is the first directory), something like
1/2/3/4/foo.file, 2/2/3/4/bar.file, etc. You might also explore a hash
function, or something that splits a SHA1 (or other) sum of the file to
get the path.
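
For the hash variant, a minimal sketch in Python of what I mean. The
function name sharded_path and the choice of three levels of two hex
characters each are only assumptions for illustration, tune them to
the number of files you expect:

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(data: bytes, ext: str = "", levels: int = 3, width: int = 2) -> str:
    """Return a relative path derived from the SHA1 sum of the file content.

    The first `levels` chunks of `width` hex characters become nested
    directories; the full digest (plus extension) is the file name.
    """
    digest = hashlib.sha1(data).hexdigest()
    # e.g. "da39a3..." -> ["da", "39", "a3"] with the defaults
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return str(PurePosixPath(*parts, digest + ext))


if __name__ == "__main__":
    print(sharded_path(b"some uploaded content", ".ext"))
```

With two hex characters per level each directory holds at most 256
subdirectories, and identical files end up at the same path. You would
store the returned path (plus metadata) in the database row.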

>>
>> Are there any open source systems which handle keeping a filesystem and
>> database in sync for this purpose, or is it a wheel that keeps getting
>> reinvented?
>>
>> I know "store your files in a filesystem" is the best long-term solution.
>> But it's just so much easier to just throw everything in the database.
>
> In the for what it is worth department check out this Wiki:
> http://sourceforge.net/apps/mediawiki/fuse/index.php?title=DatabaseFileSystems

and postgres fuse also :-D

>
> --
> Adrian Klaver
> adrian(dot)klaver(at)gmail(dot)com
>

--
Cédric Villemain
