Re: PDF files: to store in database or not

From: David Wall <d(dot)wall(at)computer(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: PDF files: to store in database or not
Date: 2016-12-06 21:02:48
Message-ID: 01b11abe-cd1b-301d-ce30-a19827666d1f@computer.org
Lists: pgsql-general

On 12/6/16 12:33 PM, Tom Lane wrote:
> John R Pierce <pierce(at)hogranch(dot)com> writes:
>> On 12/6/2016 12:10 PM, Rich Shepard wrote:
>>> I did not realize that a BLOB is not the same as a bytea (page 217 of
>>> the 9.6 PDF manual), and I cannot find BLOB as a postgres data type.
>>> Please point me in the right direction to learn how to store PDFs as
>>> BLOBs.
>> indeed BYTEA is postgres's type for storing arbitrary binary objects
>> that are called BLOB in certain other databases.
> Well, there are also "large objects", which aren't really a data type at
> all. If you're storing stuff large enough that you need to write/read
> it in chunks rather than all at once, the large-object APIs are what
> you want.
>
> regards, tom lane

Yeah, we haven't used BYTEA much; we use PG's large objects instead.
They have a streaming API, and you don't have to encode/decode every
byte going in and out of the DB.

In a table, you just define the "blob_data" column as an OID, which
holds a reference to the large object. Since we use Java/JDBC, this is
handled by ResultSet.getBlob(), which returns a java.sql.Blob object.
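
For anyone who wants to see roughly what that looks like, here is a
minimal sketch, not our actual code. It assumes the PostgreSQL JDBC
driver is on the classpath and a hypothetical table
"pdf_doc (id bigint PRIMARY KEY, blob_data oid)"; the table name, the
"id" column, and the class and method names are made up for
illustration. Large-object access has to happen inside a transaction,
so the sketch turns autocommit off around each call.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.sql.Blob;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Assumed (hypothetical) schema:
//   CREATE TABLE pdf_doc (id bigint PRIMARY KEY, blob_data oid);
final class PdfLargeObjectSketch {

    // Copy a stream in fixed-size chunks so the whole PDF never sits in memory.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    // Store a PDF as a large object; the oid column keeps the reference to it.
    static void storePdf(Connection conn, long id, InputStream pdf) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // large objects require a transaction
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO pdf_doc (id, blob_data) VALUES (?, ?)")) {
            ps.setLong(1, id);
            ps.setBlob(2, pdf); // the driver streams the data into a new large object
            ps.executeUpdate();
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }

    // Stream a stored PDF to 'out'; returns bytes copied, or -1 if the id is unknown.
    static long loadPdf(Connection conn, long id, OutputStream out)
            throws SQLException, IOException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(
                "SELECT blob_data FROM pdf_doc WHERE id = ?")) {
            ps.setLong(1, id);
            long copied = -1;
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    Blob blob = rs.getBlob(1);
                    try (InputStream in = blob.getBinaryStream()) {
                        copied = copy(in, out);
                    } finally {
                        blob.free();
                    }
                }
            }
            conn.commit();
            return copied;
        } catch (SQLException | IOException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

Keep in mind that deleting a row does not delete the large object it
points to, so some cleanup (lo_unlink, or the "lo" contrib module's
lo_manage trigger) is needed when rows go away.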

Some complain about DB backups being bigger if the PDFs are inside,
which is true, but that complaint presumes you don't care about backing
up the filesystem PDFs separately (and there is no way to ensure a
consistent DB backup and PDF filesystem backup if the system is active
while the backups run). You can certainly put the files in a filesystem
and point to them, but you'll likely need some access control or people
will be able to download any/all PDFs in a given folder. In the DB, you
surely will have access control, as I presume you don't allow browser
access to the DB <smile>.

Either way, you may want to see if your PDFs compress well or not as
that may save some storage space at the cost of compress/decompress on
accesses.
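
A quick way to check is to deflate a few representative files and look
at the ratio. The sketch below uses only java.util.zip from the JDK;
the class and method names are made up for illustration, and DEFLATE at
the default level is just one reasonable stand-in for whatever
compression you would actually use.

```java
import java.util.zip.Deflater;

final class PdfCompressionCheck {

    // Returns compressedSize / originalSize; values near 1.0 mean little to gain.
    static double compressionRatio(byte[] data) {
        if (data.length == 0) {
            return 1.0; // nothing to compress
        }
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION);
        try {
            deflater.setInput(data);
            deflater.finish();
            byte[] buf = new byte[8192];
            long compressed = 0;
            // Only count output bytes; the compressed data itself is discarded.
            while (!deflater.finished()) {
                compressed += deflater.deflate(buf);
            }
            return (double) compressed / data.length;
        } finally {
            deflater.end(); // release native zlib resources
        }
    }
}
```

You could feed it the result of java.nio.file.Files.readAllBytes() for
a handful of sample PDFs. Many PDFs already carry compressed streams
internally, so don't be surprised if the ratio comes out close to 1.0.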

David
