Re: [HACKERS] Index greater than 8k

From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Darcy Buskermolen <darcyb(at)commandprompt(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PgSQL General <pgsql-general(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Index greater than 8k
Date: 2006-11-01 04:55:04
Message-ID: 454828A8.3020105@commandprompt.com
Lists: pgsql-general pgsql-hackers


>> We are not storing bytea, a customer is. We are trying to work around
>> customer requirements. The data that is being stored is not always text,
>> sometimes it is binary (a flash file or jpeg). We are using escaped text
>> to be able to search the string contents of that file.
>
> Hmm, have you tried to create a functional trigram index on the
> equivalent of "strings(bytea_column)" or something like that?

I did consider that. I wonder what size we are going to deal with
though. Part of the problem is that some of the data we are dealing with
is quite large.

>
> I imagine strings(bytea) would be a function that returns the
> concatenation of all pure (7 bit) ASCII strings in the byte sequence.
>
> On the other hand, based on Teodor's comment on pg_trgm, maybe this
> won't be possible at all.
>> Yes we do (and can) expect to find text among the bytes. We have
>> searches running, we are just running into the maximum size issues for
>> certain rows.
>
> Do you mean you actually find stuff based on text attributes in JPEG
> images and the like? I thought those were compressed ...

Well, a JPEG is probably a bad example, but yes, they do search JPEGs, I am
guessing mostly for header information. A better example would be
PostScript files, Flash files and of course large amounts of text + HTML.
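
For reference, here is a minimal sketch in Python of what the
strings(bytea) function quoted above might do. It is not anything that
exists in PostgreSQL or pg_trgm. The restriction to printable bytes
(0x20-0x7e), the min_len cutoff of 4 and the single-space separator are
assumptions borrowed from the Unix strings utility, not something
specified in this thread. To be usable in a functional index, the same
logic would have to live in an immutable server-side function.

```python
import re


def strings(data: bytes, min_len: int = 4, sep: str = " ") -> str:
    """Concatenate the printable 7-bit ASCII runs found in `data`.

    Hypothetical helper: `min_len` and `sep` are illustrative defaults,
    loosely modelled on the Unix `strings` utility.
    """
    # A run is min_len or more consecutive printable ASCII bytes.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return sep.join(run.decode("ascii") for run in pattern.findall(data))


if __name__ == "__main__":
    sample = b"\x00\x01GIF89a\xff\xfehello world\x00ab\x00"
    # The two-byte run "ab" is shorter than min_len and is dropped.
    print(strings(sample))
```

The extracted text is at most as long as the input, so it only helps
with the size limit when the binary content dominates the row; a large
text or HTML document would come through nearly unchanged.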

Sincerely,

Joshua D. Drake

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
