Re: Need suggestion

From: Ben Chobot <bench(at)silentmedia(dot)com>
To: Carl von Clausewitz <clausewitz45(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Need suggestion
Date: 2011-06-02 16:32:54
Message-ID: FFCD031D-068D-4C52-B37F-1FE99E1B213A@silentmedia.com
Lists: pgsql-general

On Jun 1, 2011, at 1:08 AM, Carl von Clausewitz wrote:

> Hello Everyone,
>
> I have a new project with 100 users in Europe. I need to handle production and sales processes and their documentation in PostgreSQL with PHP. The load from the sales process is negligible, but every user produces 2 transactions in the production process, with 10-30 scanned documents (400 KB - 800 KB each) and 30-50 high-resolution pictures (3-8 MB each), and they want to upload them 'somewhere'. 'Somewhere' could be the server's file system, with a link in the PostgreSQL database pointing to the location of the files (plus some metadata), or it could be the PostgreSQL database itself.
>
> My question is: what is your opinion on storing the scanned documents and the pictures in the database? This is a huge amount of data (between 188 MB and 800 MB daily, about 1 TB in an average year), but it must be searchable, and any document must be retrievable within 1 hour. All documentation must be stored for up to 5 years... That means the database could be about 6-7 TB after 5 years, at which point we can start archiving documents. Any other data size is negligible.
>
> If you suggest storing all of the data in PostgreSQL, what do you recommend regarding table and index structure, clustering, and archiving?

So, you're mostly storing ~1TB of images/year? That doesn't seem so bad. How will the documents be searched? Will their contents be OCR'd out and put into a full text search? How many searches will be going on?
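
For a rough check on the volume figures quoted above, here is a minimal sketch of the arithmetic. It assumes the document and picture counts apply per transaction and that 1 MB is 1000 KB; the variable names are only illustrative.

```python
# Sizes are in KB (1 MB taken as 1000 KB) so the arithmetic stays in integers.
# Each tuple is (minimum, maximum) as quoted in the original message.
DOCS_PER_TXN = (10, 30)      # scanned documents per transaction
DOC_KB = (400, 800)          # size of one scanned document
PICS_PER_TXN = (30, 50)      # high-resolution pictures per transaction
PIC_KB = (3000, 8000)        # size of one picture
TXNS = 2                     # transactions per user

def batch_kb(i):
    """Total KB for TXNS transactions, using the min (i=0) or max (i=1) figures."""
    per_txn = DOCS_PER_TXN[i] * DOC_KB[i] + PICS_PER_TXN[i] * PIC_KB[i]
    return TXNS * per_txn

low_kb = batch_kb(0)
high_kb = batch_kb(1)
print(low_kb // 1000, "MB to", high_kb // 1000, "MB")  # 188 MB to 848 MB
```

The low end reproduces the quoted 188 MB, and the high end comes out at 848 MB, a little above the quoted 800 MB. How these batches add up to roughly 1 TB per year depends on how often they occur across all users, which the message does not state.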

If you're asking whether or not it makes sense to store 7TB of images in the database, as opposed to storing links to those images and keeping the images themselves on a normal filesystem, there's no clear answer. Check the archives for pros and cons of each method.
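
To make the "links in the database, files on the filesystem" option a bit more concrete, here is a minimal sketch of one way to derive a storage path and a metadata record for an uploaded file. Naming files by their SHA-256 hash and the two-level directory layout are assumptions chosen for illustration, not something proposed in this thread, and `storage_path` and `metadata_row` are hypothetical helper names.

```python
import hashlib
from pathlib import PurePosixPath

def storage_path(content: bytes, suffix: str) -> PurePosixPath:
    """Relative path for a file, named by the SHA-256 of its content.

    The first two hex-digit pairs become directories so that no single
    directory ends up holding every file (an illustrative layout).
    """
    digest = hashlib.sha256(content).hexdigest()
    return PurePosixPath(digest[:2], digest[2:4], digest + suffix)

def metadata_row(content: bytes, original_name: str) -> dict:
    """Values one might store in a database row instead of the file itself."""
    suffix = PurePosixPath(original_name).suffix.lower()
    path = storage_path(content, suffix)
    return {
        "original_name": original_name,
        "size_bytes": len(content),
        "sha256": path.stem,        # file name without the extension
        "rel_path": str(path),      # the "link" kept in the database
    }

row = metadata_row(b"example scan bytes", "Invoice-0001.PDF")
print(row["rel_path"])
```

With a layout like this the database row stays small and searchable, while the file bytes live on an ordinary filesystem. The trade-off is that backups and consistency between the two stores have to be handled by the application.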
