Re: [SQL] Re: 500KB PDF files for postgresql8.3, which is the most efficient way?

From: Samuel Gendler <sgendler(at)ideasculptor(dot)com>
To: emilu(at)encs(dot)concordia(dot)ca
Cc: Jasen Betts <jasen(at)xnet(dot)co(dot)nz>, pgsql-general(at)postgresql(dot)org, PostgreSQL SQL List <pgsql-sql(at)postgresql(dot)org>
Subject: Re: [SQL] Re: 500KB PDF files for postgresql8.3, which is the most efficient way?
Date: 2011-05-27 19:28:22
Message-ID: BANLkTi=safG7dp3U3LGQPc95j6Yh75KmHQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-sql

On Fri, May 27, 2011 at 9:15 AM, Emi Lu <emilu(at)encs(dot)concordia(dot)ca> wrote:

>
> Solution:
> ====================
> (1) Save pdfs to file system, only point file name in psql8.3
>
> (2) Save oids of pdfs into table
>
> (3) Save pdf files as bytea column in psql8.3
>
>
> Pros and cons for (1), (2), (3), which is the most efficient way?
>

You asked about the "most efficient" but you didn't tell us what you are
doing with the pdf files. It really doesn't matter how you store it if you
are simply inserting a 500KB object 30 times per year - then it becomes all
about convenience. I'd probably rule out solution #1 in that case purely
from a maintenance perspective. Storing the PDFs in the db eliminates the
maintenance work of ensuring that the files are archived along with the db
backups, that file paths stay correct whenever a new db host is provisioned,
and storing the files in the db provides transaction-safe file storage. If,
on the other hand, you are selecting/updating the rows in question several
hundred times per second or more, then you may well want to put some thought
into efficiency and, assuming some kind of web access, providing your
webserver/caching layer with direct access to the files on the filesystem
for service efficiency. But if you are only occasionally accessing the
files in question, there's little reason not to put them in the db.

If mybatis can't load the object that references the file without also
loading the entire file from the db - and you are using that object for
other things that will require frequent loading/storing - then you probably
don't want the files in the db. If it can load the object without loading
the binary data, by lazily loading that column only when requested, then it
doesn't matter. Using BLOBs guarantees that you can access the entire row
without loading the binary data if mybatis exposes the blob separately from
the rest of the object, but mybatis may be capable of issuing a select
without that column and then grabbing that column as needed in order to
simulate that in the case of a bytea column, anyway. Internally, a large
bytea column is treated similarly to a blob, with any data over a certain
size not stored in-line with the rest of the row for efficiency reasons.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message preetika tyagi 2011-05-27 19:36:35 Shared Buffer Size
Previous Message Tarlika Elisabeth Schmitz 2011-05-27 19:01:35 Re: trigger - dynamic WHERE clause

Browse pgsql-sql by date

  From Date Subject
Next Message Kevin Crain 2011-05-27 19:28:51 Order of evaluation in triggers for checks on inherited table partitions
Previous Message Emi Lu 2011-05-27 16:15:00 Re: 500KB PDF files for postgresql8.3, which is the most efficient way?