Re: copy command and blobs

From: Greg Spiegelberg <gspiegelberg(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Madhu Ramachandran <iammadhu(at)gmail(dot)com>, Mladen Gogala <mladen(dot)gogala(at)vmsinfo(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: copy command and blobs
Date: 2011-02-06 18:15:18
Message-ID: AANLkTimfbcCickrZfiWLLoDqiJGKquFYwki2f_FPPK53@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sat, Jan 22, 2011 at 8:41 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Fri, Jan 21, 2011 at 5:10 PM, Madhu Ramachandran <iammadhu(at)gmail(dot)com>
> wrote:
> > i was looking at
> > http://www.postgresql.org/files/documentation/books/aw_pgsql/node96.html
> > when they talk about using OID type to store large blobs (in my case .jpg
> > files )
>
> It's probably worth noting that that document is 9 years old. It
> might be worth reading something a little more up-to-date. Perhaps:
>
> http://www.postgresql.org/docs/current/static/largeobjects.html
>
>
A bit late to respond but better than never!

As of my latest testing in 8.3, I've found that the lo_* functions while
adequate are a bit slow. Our implemented alternative that leverages
pg_read_file() is significantly faster. I believe it is because
pg_read_file() tells the database to go straight to the file system rather
than through the client connection. From memory, I seem to recall this
being about 20% faster than the lo_* option or simple INSERTs.

The downside to pg_read_file() is that the file must be 1) on the same
system as the database and 2) must be under the $PGDATA directory. We opted
to create a directory $PGDATA/public with proper system-side permissions but
open enough to allow the database owner to read the files.

For example,
postgres=# select pg_read_file('public/a_file', 0,
(pg_stat_file('postgresql.conf')).size);

We use this method in conjunction with additional checks to store files in
tables governed by the MD5 hash of the file to prevent duplication.

HTH.
Greg

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Linos 2011-02-06 18:16:23 general hardware advice
Previous Message Scott Marlowe 2011-02-06 15:23:12 Re: Really really slow select count(*)