Re: BLOB's bypassing the OS Filesystem for better Image

From: PFC <lists(at)boutiquenumerique(dot)com>
To: apoc9009(at)yahoo(dot)de, pgsql-performance(at)postgresql(dot)org
Subject: Re: BLOB's bypassing the OS Filesystem for better Image
Date: 2005-05-01 23:22:49
Message-ID: op.sp4dwbd2th1vuj@localhost
Lists: pgsql-performance


My laptop reads an entire compiled Linux kernel tree (23,000 files totalling
250 MBytes) in about 1.5 seconds if the files are in cache. That's about
15,000 files/second. Do you think that's slow? If you want to read them in
random order, you'll probably want something other than a laptop drive, but
you get the idea.
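
A rough way to repeat this kind of measurement is sketched below in Python. The message does not say how the timing was taken, so the function name `read_tree` and the whole approach are assumptions for illustration, not the method actually used.

```python
import os
import time


def read_tree(root):
    """Read every regular file under root.

    Returns (file_count, total_bytes, elapsed_seconds).
    """
    count = 0
    total = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip broken symlinks and anything that is not a regular file.
            if not os.path.isfile(path):
                continue
            with open(path, "rb") as f:
                total += len(f.read())
            count += 1
    elapsed = time.perf_counter() - start
    return count, total, elapsed


if __name__ == "__main__":
    # "." is a placeholder; point it at the tree you want to measure.
    n, size, secs = read_tree(".")
    rate = n / secs if secs > 0 else float("inf")
    print(f"{n} files, {size / 1e6:.1f} MB in {secs:.2f} s ({rate:.0f} files/s)")
```

Running it twice in a row on the same tree gives an idea of the cached case: the first pass pulls the files into the OS page cache (if they fit in RAM), and the second pass mostly measures filesystem and syscall overhead.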

Filesystem is reiser4.

If you use ext2, you'll have a problem with many files in the same
directory because I believe it uses a linear search, so lookup time is
proportional to the number of files (ouch). I once tried to put a million
1-kbyte files in one directory; that was with reiserfs 3, and it didn't
seem bothered at all. I believe it took some 10 minutes, but that was two
years ago so I don't remember very well. NTFS took a day, that I do
remember! Out of curiosity I just tried stuffing 1 million 1 KB files into
a directory on my laptop, and it took a bit less than two minutes.
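
For anyone who wants to try the many-files-in-one-directory test on their own filesystem, here is a minimal Python sketch. The helper name `fill_directory`, the file naming scheme and the zero-filled payload are all assumptions; the message does not describe how the files were created.

```python
import os
import tempfile
import time


def fill_directory(directory, n_files, size=1024):
    """Create n_files files of `size` bytes each in one directory.

    Returns the elapsed time in seconds.
    """
    payload = b"\0" * size
    start = time.perf_counter()
    for i in range(n_files):
        # Zero-padded names keep every file name the same length.
        with open(os.path.join(directory, f"file_{i:07d}"), "wb") as f:
            f.write(payload)
    return time.perf_counter() - start


if __name__ == "__main__":
    # The temporary directory may live on tmpfs; pass dir=... to
    # TemporaryDirectory to target the filesystem you actually want to test.
    with tempfile.TemporaryDirectory() as d:
        n = 10_000  # raise towards 1_000_000 once you trust the setup
        seconds = fill_directory(d, n)
        print(f"{n} files in {seconds:.2f} s")
```

This measures creation time as seen by the application only. Nothing is fsync'ed, so much of the data may still be sitting in the page cache when the timer stops.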

On Tue, 26 Apr 2005 11:34:45 +0200, apoc9009(at)yahoo(dot)de <apoc9009(at)yahoo(dot)de>
wrote:

>
>> Which filesystems? I know ext2 used to have issues with many-thousands
>> of files in one directory, but that was a directory scanning issue
>> rather than file reading.
>
> From my point of view, it is better to let one process do the
> operations on a Postgres cluster file structure than to
> bypass it with a second process.
>
> For example:
> A User loads up some JPEG Images over HTTP.
>
> a) (Filesystem)
> On the filesystem it would be written to a file with a randomly generated
> filename (timestamp or whatever)
> (the directory grows, and over a million file objects will be
> archived, written, replaced, etc.)
>
> b) (Database)
> The JPEG image data will be stored in a BLOB as part of a
> special table, which is linked
> via the custid to the primary user table.
>
> From my point of view, any outside process (it must be created, forked,
> have memory allocated, etc.)
> is a bad choice. I think it is generally better to support the postmaster
> in every way and set up a
> hardware RAID configuration.
>
>>> My question:
>>> Can I speed up my web application if I store my small JPEG images
>>> inside my PostgreSQL database (on very large databases, 1 GByte
>>> and above, without images at this time)?
>>
>>
>> No. Otherwise the filesystem people would build their filesystems on
>> top of PostgreSQL not the other way around. Of course, if you want
>> image updates to be part of a database transaction, then it might be
>> worth storing them in the database.
>
> Hmm, Oracle is going the other way. All file objects can be stored in
> the database if the DB
> has the IFS option (database filesystem and file server inside the
> database).
>
>
>>
>>> I hope some people can give me a tip or hint where I can find
>>> some useful information about it!
>>
>> Look into having a separate server (process or actual hardware) to
>> handle requests for static text and images. Keep the Java server for
>> actually processing
>
>
> Thanks
