Re: column type for pdf file

From: "Ross J(dot) Reedstrom" <reedstrm(at)rice(dot)edu>
To: Eric McKeeth <eldin00(at)gmail(dot)com>
Cc: emilu(at)encs(dot)concordia(dot)ca, pgsql-sql(at)postgresql(dot)org
Subject: Re: column type for pdf file
Date: 2011-05-26 18:07:56
Message-ID: 20110526180756.GC1938@rice.edu
Lists: pgsql-sql

On Wed, May 18, 2011 at 05:06:36PM -0600, Eric McKeeth wrote:
> On Wed, May 18, 2011 at 2:20 PM, Emi Lu <emilu(at)encs(dot)concordia(dot)ca> wrote:
>
> > Hello,
> >
> > To save pdf files into postgresql8.3, what is the best column type?
> >
> > bytea, blob, etc?
> >
> > Thank you,
> > Emi
> >
>
> Everyone else has pointed out reasons for not doing this, and I agree with
> them that in the large majority of cases just storing a reference to a file
> stored outside the database is preferable. However, to answer the question
> you asked, my rule of thumb is that if you need to store binary data in the
> database, you should use a bytea column, unless you need the random access
> capabilities that the large object interface provides. A bytea column is
> typically easier to use, and has proper transactional behavior, enforcement
> of referential integrity, etc.
>

I'm with Eric on this one: for smaller use cases, the convenience of bytea
in the db is nice. As to random access, I wrote a client-side wrapper
for our middleware that implements a file iterator interface for Python
on top of substr(bytea,position,blocksize). I was sort of surprised at
how well it performed. We're using it in production right now.
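
For anyone wanting to try the same idea, here is a minimal sketch of block-wise
reading on top of substr(bytea,position,blocksize). It is not the production
wrapper, just an illustration. The table name "files", its columns "id" and
"data", the class and function names, and the psycopg2-style %s placeholders
are all assumptions made for the example. The in-memory fetcher only exists so
the sketch runs without a database.

```python
from typing import Callable, Iterator


class ByteaBlockReader:
    """Iterate over a stored file in fixed-size blocks.

    fetch_block(position, blocksize) must behave like SQL substr():
    position is 1-based, and an empty result means end of data.
    """

    def __init__(self, fetch_block: Callable[[int, int], bytes],
                 blocksize: int = 8192) -> None:
        if blocksize <= 0:
            raise ValueError("blocksize must be positive")
        self._fetch_block = fetch_block
        self._blocksize = blocksize

    def __iter__(self) -> Iterator[bytes]:
        position = 1  # SQL substr() counts from 1, not 0
        while True:
            block = self._fetch_block(position, self._blocksize)
            if not block:
                return
            yield block
            position += len(block)


def make_db_fetcher(cursor, file_id) -> Callable[[int, int], bytes]:
    """Build a fetcher from a DB-API cursor (hypothetical schema, see above)."""
    def fetch(position: int, blocksize: int) -> bytes:
        cursor.execute(
            "SELECT substr(data, %s, %s) FROM files WHERE id = %s",
            (position, blocksize, file_id),
        )
        row = cursor.fetchone()
        # Drivers may hand back a buffer-like object; normalize to bytes.
        return bytes(row[0]) if row and row[0] is not None else b""
    return fetch


def in_memory_fetcher(data: bytes) -> Callable[[int, int], bytes]:
    """Stand-in for the database, emulating 1-based substr() on bytes."""
    def fetch(position: int, blocksize: int) -> bytes:
        start = position - 1
        return data[start:start + blocksize]
    return fetch
```

With the stand-in fetcher, joining the blocks gives back the original bytes:
b"".join(ByteaBlockReader(in_memory_fetcher(b"hello world"), blocksize=4))
returns b"hello world". Each block costs one query, so the blocksize is the
knob that trades round trips against memory per read.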

I actually store files in a leaf table w/ an id and hash, with
filenames in a separate linking table, so I'm even getting data
deduplication (all the rage in biz these days) for free.
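
A rough sketch of that layout, for illustration only: it uses Python's
sqlite3 module as a stand-in so it runs without a server (in PostgreSQL the
data column would be bytea). The table names "file_blob" and "file_name", the
function names, and the choice of SHA-256 as the hash are all assumptions,
not the actual schema.

```python
import hashlib
import sqlite3

# Leaf table holds each distinct content once, keyed by id and hash.
# Linking table maps any number of filenames onto a leaf row.
SCHEMA = """
CREATE TABLE file_blob (
    id   INTEGER PRIMARY KEY,
    hash TEXT NOT NULL UNIQUE,
    data BLOB NOT NULL
);
CREATE TABLE file_name (
    name    TEXT PRIMARY KEY,
    blob_id INTEGER NOT NULL REFERENCES file_blob(id)
);
"""


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with the two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def store_file(conn: sqlite3.Connection, name: str, data: bytes) -> int:
    """Store data under name, reusing an existing row if the hash matches."""
    digest = hashlib.sha256(data).hexdigest()
    row = conn.execute(
        "SELECT id FROM file_blob WHERE hash = ?", (digest,)
    ).fetchone()
    if row is None:
        cur = conn.execute(
            "INSERT INTO file_blob (hash, data) VALUES (?, ?)", (digest, data)
        )
        blob_id = cur.lastrowid
    else:
        blob_id = row[0]  # identical content already stored: just link to it
    conn.execute(
        "INSERT OR REPLACE INTO file_name (name, blob_id) VALUES (?, ?)",
        (name, blob_id),
    )
    return blob_id
```

Storing the same bytes under two names, e.g. store_file(conn, "a.pdf", pdf)
and store_file(conn, "copy.pdf", pdf), returns the same id both times and
leaves a single row in the leaf table.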

Ross
--
Ross Reedstrom, Ph.D. reedstrm(at)rice(dot)edu
Systems Engineer & Admin, Research Scientist phone: 713-348-6166
Connexions http://cnx.org fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE
