Skip site navigation (1) Skip section navigation (2)

Re: Storing many big files in database- should I do it?

From: Adrian von Bidder <avbidder(at)fortytwo(dot)ch>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Storing many big files in database- should I do it?
Date: 2010-04-27 10:43:28
Message-ID: 201004271243.35493@fortytwo.ch (view raw or flat)
Thread:
Lists: pgsql-general
On Tuesday 27 April 2010 11.17:42 Cédric Villemain wrote:
> > Anyone had this kind of design problem and how did you solve it?
> 
> store your files in a filesystem, and keep the path to the file (plus
> metadata, acl, etc...) in database.

... and be careful that db and file storage do not go out of sync.

But if files are ever only added and possibly removed (but never changed), 
this is not too hard:

 * be sure to commit db transaction only after file has been written to disk 
(use fsync or similar to be sure!)  (For file deletions: first delete db 
metadata, then delete the file.)
 * be sure to detect failed writes and abort the db transaction or otherwise 
properly handle errors while storing the file.
 * occasionally run a clean-up to remove files that were written to 
filesystem where the db metadata was not stored.  Should be a rare case but 
it probably will happen.

PostgreSQL support 2PC (PREPARE and then COMMIT as separate steps); you may 
want to use this (PREPARE database transaction, then do filesystem 
operations.  If filessystem operation fails, you cann ROLLBACK the db 
connection, otherwise COMMIT.)  That way, you don't lose transactional 
semantics.

Backup requires some more thought.  I guess you could use some kind of 
volume management to get filesysstem snapshots, but you have to be sure the 
fs snapshot reflects the point in time when the database backup was made.  
Depending on load / availability requirements you may get away with stopping 
data modification at the application level for a few seconds until the db 
backup has started and the filesystem snapshot has been created.

cheers
-- vbi

-- 
featured product: PostgreSQL - http://postgresql.org

In response to

pgsql-general by date

Next:From: Raymond O'DonnellDate: 2010-04-27 10:49:54
Subject: Re: pg_hba.conf
Previous:From: jkunkelDate: 2010-04-27 10:42:59
Subject: pg_hba.conf

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group