Re: Best practice for file storage?

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Joe Kramer <cckramer(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Best practice for file storage?
Date: 2010-01-31 14:06:44
Message-ID: 4B658E74.9030601@postnewspapers.com.au
Lists: pgsql-general

On 31/01/2010 6:46 PM, Joe Kramer wrote:
> Hi,
>
> I need to store a lot of large files (thousands of 10-100 MB files)
> uploaded through my web application, and I find that storing them in
> the database as bytea fields is not practical for backup purposes.
> My database has a full backup performed every 12 hours; the backup is
> encrypted and copied to a server on another continent. Having all the
> heavy binary data in the database would make backups impossible.
>
> So I am thinking of having a table just for metadata and file id
> sequence and storing the file on file system:
>
> CREATE TABLE business_logo
> (
> file_id bigserial NOT NULL,
> file_name varchar,
> file_date timestamp
> );
>
> Storing file in path composed from serial id, e.g. file with id 2345
> will be stored in
> /webapp/files/2/3/4/5/2345

Consider going a bit easier on your path component splitting. A
directory containing only 10 subdirectories gets expensive in directory
traversal (and is a nightmare for admins to work with); it'll probably
be much better to use 23/45/2345 or even bigger chunks.
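As a quick sketch, a two-character-chunk scheme like 23/45/2345 could be
computed along these lines (the `storage_path` helper and its parameters
are made up for illustration, not part of any library):

```python
import os

def storage_path(file_id: int, chunk: int = 2, depth: int = 2) -> str:
    """Build a relative path like 23/45/2345 by zero-padding the id
    and splitting it into fixed-width directory chunks."""
    s = str(file_id).zfill(chunk * depth)
    # Take the first chunk*depth digits as directory levels.
    parts = [s[i:i + chunk] for i in range(0, chunk * depth, chunk)]
    # Keep the full id as the final file name.
    return os.path.join(*parts, str(file_id))

print(storage_path(2345))  # 23/45/2345
print(storage_path(7))     # 00/07/7
```

With chunk=2 and depth=2 each directory holds up to 100 entries, which
keeps the tree shallow while still bounding directory size.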

This depends a lot on the file system, so testing it is probably best.
ext3 with dir_index shouldn't need much in the way of such manipulation
at all - I regularly store tens of thousands of files in a single
directory without issues.

--
Craig Ringer
