Re: Creating large database of MD5 hash values

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Jon Stewart <jonathan(dot)l(dot)stewart(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Creating large database of MD5 hash values
Date: 2008-04-11 14:25:44
Message-ID: 20080411142543.GA6442@alvh.no-ip.org
Lists: pgsql-performance

Jon Stewart wrote:
> Hello,
>
> I am creating a large database of MD5 hash values. I am a relative
> newb with PostgreSQL (or any database for that matter). The schema and
> operation will be quite simple -- only a few tables, probably no
> stored procedures -- but I may easily end up with several hundred
> million rows of hash values, possibly even getting into the billions. The
> hash values will be organized into logical sets, with a many-many
> relationship. I have some questions before I set out on this endeavor,
> however, and would appreciate any and all feedback, including SWAGs,
> WAGs, and outright lies. :-) I am trying to batch up operations as
> much as possible, so I will largely be doing comparisons of whole
> sets, with bulk COPY importing. I hope to avoid single hash value
> lookup as much as possible.

If MD5 values will be your primary data and you'll be storing millions
of them, it would be wise to create your own datatype and operators with
the most compact and efficient representation possible.
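
To give a rough sense of what "compact" means here, the sketch below
compares the usual 32-character hex spelling of an MD5 digest with its raw
16-byte form. It is only an illustration in Python, not the PostgreSQL
datatype itself; the helper names (md5_hex, md5_raw, hex_to_raw) are
hypothetical, and a real custom type would presumably store the same 16
bytes in a fixed-length value.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest as a 32-character lowercase hex string."""
    return hashlib.md5(data).hexdigest()


def md5_raw(data: bytes) -> bytes:
    """Return the MD5 digest as its raw 16-byte value."""
    return hashlib.md5(data).digest()


def hex_to_raw(hex_digest: str) -> bytes:
    """Convert a hex-encoded MD5 digest to the compact 16-byte form."""
    raw = bytes.fromhex(hex_digest)
    if len(raw) != 16:
        raise ValueError("an MD5 digest must decode to exactly 16 bytes")
    return raw


sample = b"hello world"      # arbitrary example input
hex_form = md5_hex(sample)   # what a text column would hold
raw_form = hex_to_raw(hex_form)  # what a compact representation would hold

# The raw form needs half the payload bytes of the hex form.
print(len(hex_form), len(raw_form))
```

Comparing raw digests byte by byte gives the same ordering as comparing
their lowercase hex spellings, so equality and ordering operators for such
a type could work directly on the 16 bytes.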

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
