Re: Large objects performance

From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: "Alexandre Vasconcelos" <alex(dot)vasconcelos(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Large objects performance
Date: 2007-04-12 13:42:03
Message-ID: b42b73150704120642p239ea6ddkef9de07afc5921f2@mail.gmail.com
Lists: pgsql-performance

On 4/4/07, Alexandre Vasconcelos <alex(dot)vasconcelos(at)gmail(dot)com> wrote:
> We have an application meant to sign documents and store them
> somewhere. File sizes may vary from KB to MB. Developers are
> arguing about whether to store files directly on the operating system
> file system or in the database, as large objects. My boss is
> considering file system storage, because he is concerned about
> integrity and backup/restore corruption. I'd like to know some reasons
> to convince them to store these files in PostgreSQL, including
> integrity and, of course, performance. I would like to know the
> disadvantages of file system storage as well.

This topic actually gets debated about once a month on the lists :-).
Check the archives, but here is a quick summary:

Storing objects on the file system:
* usually indexed on the database for searching
* faster than database (usually)
* more typical usage pattern
* requires extra engineering if you want to store huge numbers of objects
* requires extra engineering to keep your database in sync. on the
postgresql irc channel someone suggested a clever solution with inotify
* backup can be a pain (even rsync has its limits) -- for really big
systems, look at clustering solutions (drbd for example)
* lots of people will tell you this 'feels' right or wrong -- ignore them :-)
* well traveled path. it can be made to work.
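
To give a rough idea of the "extra engineering" for huge numbers of objects: one common approach (an illustration, not something prescribed in this thread) is to spread files over nested directories keyed by a content hash, so no single directory grows too large. The names `sharded_path` and `store` and the two-level layout are hypothetical choices for this sketch; the returned path is what you would index in the database.

```python
import hashlib
from pathlib import Path


def sharded_path(root: Path, data: bytes, levels: int = 2) -> Path:
    """Build a path like root/ab/cd/<sha256 hex> from the file content."""
    digest = hashlib.sha256(data).hexdigest()
    # Use the first `levels` pairs of hex digits as directory names.
    parts = [digest[i * 2:(i + 1) * 2] for i in range(levels)]
    return root.joinpath(*parts, digest)


def store(root: Path, data: bytes) -> Path:
    """Write data under its sharded path and return that path."""
    path = sharded_path(root, data)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first, then rename, so readers never
    # see a half-written file (rename is atomic on the same filesystem).
    tmp = path.with_suffix(".tmp")
    tmp.write_bytes(data)
    tmp.replace(path)
    return path
```

Keeping the database row and the file in step (insert the row only after `store` returns, clean up orphans on failure) is the part that still needs care.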

Storing objects on the database:
* slower, but getting faster -- it's mostly cpu bound currently
* get a very recent cpu. core2 xeons appear to be particularly good at this.
* use bytea, not large objects
* will punish you if your client interface does not communicate with
the database in binary
* less engineering in the sense you are not maintaining two separate systems
* forget backing up with pg_dump...go right to pitr (maybe slony?)
* 1gb limit. be aware of high memory requirements
* you get to work with all your data with a single interface and
administer one system -- that's the big payoff.
* less well traveled path. put your r&d cap on and be optimistic but
skeptical. do some tests.
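
A small sketch of why the text protocol hurts and why the 1gb limit matters. It assumes the hex text representation of bytea used by newer PostgreSQL releases (`\x` followed by two hex digits per byte); the older escape format encodes differently, so treat the numbers as illustrative only. The function names and the `MAX_FIELD_BYTES` constant are made up for this example.

```python
MAX_FIELD_BYTES = 1 << 30  # the 1gb limit mentioned above


def bytea_hex_text(data: bytes) -> str:
    """Text form of a bytea value in hex format: \\x then 2 hex digits per byte."""
    return "\\x" + data.hex()


def text_overhead(data: bytes) -> float:
    """How many characters travel over the wire per byte of payload."""
    return len(bytea_hex_text(data)) / len(data)


def fits_in_bytea(size: int) -> bool:
    """Conservative check that an object stays under the field size limit."""
    return size < MAX_FIELD_BYTES
```

With the text form, every payload byte costs about two characters plus the encode/decode cpu on both ends, which a binary-capable client interface avoids. Measure with your own driver before deciding.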

merlin
