Re: Storing many big files in database- should I do it?

From: David Wall <d(dot)wall(at)computer(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Storing many big files in database- should I do it?
Date: 2010-04-29 16:07:52
Message-ID: 4BD9AED8.5060502@computer.org
Lists: pgsql-general

Things to consider when /not/ storing them in the DB:

1) Backups of DB are incomplete without a corresponding backup of the files.

2) No transactional integrity between filesystem and DB, so you will
have to deal with orphans from both INSERT and DELETE (assuming you
don't also update the files).

3) No built-in ability for replication, such as WAL shipping.
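
As a rough illustration of the orphan problem in point 2, here is a minimal sketch of a reconciliation pass. It assumes the DB stores each file's path relative to a storage root; the function name find_orphans and its arguments are hypothetical, and fetching the list of paths from the DB is left out.

```python
from pathlib import Path


def find_orphans(db_paths, root):
    """Compare paths recorded in the DB with files actually on disk.

    db_paths: iterable of paths relative to root, using '/' separators
              (assumed to come from a query against the metadata table).
    root:     directory under which the files are stored.
    Returns (missing_files, stray_files), both sorted lists.
    """
    root = Path(root)
    # Every regular file under root, expressed relative to root.
    on_disk = {
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file()
    }
    in_db = set(db_paths)
    missing_files = sorted(in_db - on_disk)  # row exists, file is gone
    stray_files = sorted(on_disk - in_db)    # file exists, no row points at it
    return missing_files, stray_files
```

A pass like this only detects orphans after the fact; it does not give you the transactional guarantee the DB would.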

The big downside of storing them in the DB is that all large objects
appear to be stored together in pg_catalog.pg_largeobject. That seems
axiomatically troubling: you know you have lots of big data, yet you
store it all together and then worry about running out of 'loids'.

David

On 4/29/2010 2:10 AM, Cédric Villemain wrote:
> 2010/4/28 Adrian Klaver<adrian(dot)klaver(at)gmail(dot)com>:
>
>> On Tuesday 27 April 2010 5:45:43 pm Anthony wrote:
>>
>>> On Tue, Apr 27, 2010 at 5:17 AM, Cédric Villemain
>>> <cedric(dot)villemain(dot)debian(at)gmail(dot)com> wrote:
>>>
>>>> store your files in a filesystem, and keep the path to the file (plus
>>>> metadata, acl, etc...) in database.
>>>>
>>> What type of filesystem is good for this? A filesystem with support for
>>> storing tens of thousands of files in a single directory, or should one
>>> play the 41/56/34/41563489.ext game?
>>>
> I'd prefer to go with XFS or ext{3,4}, in both cases with a path game.
> Your path game will let you handle the scalability of your uploads (so
> the first increment is the first directory), something like
> 1/2/3/4/foo.file, 2/2/3/4/bar.file, etc. You might explore a hash
> function, or something that splits a SHA1 (or other) sum of the file
> to get the path.
>
>
>
>>> Are there any open source systems which handle keeping a filesystem and
>>> database in sync for this purpose, or is it a wheel that keeps getting
>>> reinvented?
>>>
>>> I know "store your files in a filesystem" is the best long-term solution.
>>> But it's just so much easier to just throw everything in the database.
>>>
>> In the for-what-it-is-worth department, check out this wiki:
>> http://sourceforge.net/apps/mediawiki/fuse/index.php?title=DatabaseFileSystems
>>
> and postgres fuse also :-D
>
>
>> --
>> Adrian Klaver
>> adrian(dot)klaver(at)gmail(dot)com
>>
>>
>
>
>
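
For what the "path game" quoted above might look like in practice, here is a minimal sketch that splits a SHA1 sum of the file contents into directory levels, in the spirit of the 41/56/34/41563489.ext layout. The function name sharded_path and the levels/width parameters are assumptions for illustration, not something proposed in the thread.

```python
import hashlib
from pathlib import Path


def sharded_path(data: bytes, ext: str = "", levels: int = 3, width: int = 2) -> Path:
    """Build a relative storage path from the SHA1 sum of the file contents.

    The first levels * width hex characters of the digest become nested
    directory names; the full digest plus ext becomes the file name.
    """
    digest = hashlib.sha1(data).hexdigest()
    # e.g. levels=3, width=2 -> ["da", "39", "a3"] for a digest "da39a3..."
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return Path(*parts, digest + ext)
```

With width=2 each directory holds at most 256 subdirectories, so no single directory grows very large. Naming files by content hash also means identical uploads map to the same path, which may or may not be what you want.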
