Re: store A LOT of 3-tuples for comparisons

From: Matthew <matthew(at)flymine(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: store A LOT of 3-tuples for comparisons
Date: 2008-02-22 15:49:34
Message-ID: Pine.LNX.4.64.0802221546370.20402@aragorn.flymine.org
Lists: pgsql-performance

On Fri, 22 Feb 2008, Moritz Onken wrote:
> I need to store a lot of 3-tuples of words (e.g. "he", "can", "drink"), order
> matters!
> The source is about 4 GB of these 3-tuples.
> I need to store them in a table and check whether one of them is already
> stored, and if that's the case to increment a column named "count" (or
> something).

I suggest storing each 3-tuple in three varchar columns, one per word.
You should then create a single B-tree index over the three columns together,
in tuple order, so that checking for a complete tuple is one index lookup.
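
As a rough sketch of that layout (not tested against your data): the snippet
below uses Python's sqlite3 module as a stand-in for PostgreSQL so that it
runs anywhere. The table, column and index names (triples, w1, w2, w3, cnt,
triples_words_idx) and the bump() helper are made up for illustration. It
shows the check-and-increment approach backed by the three-column index.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE triples (
        w1  varchar NOT NULL,
        w2  varchar NOT NULL,
        w3  varchar NOT NULL,
        cnt integer NOT NULL DEFAULT 1
    );
    -- One index over all three columns, in tuple order.
    CREATE INDEX triples_words_idx ON triples (w1, w2, w3);
""")

def bump(conn, w1, w2, w3):
    """Increment the counter if the tuple exists, otherwise insert it."""
    cur = conn.execute(
        "UPDATE triples SET cnt = cnt + 1 WHERE w1 = ? AND w2 = ? AND w3 = ?",
        (w1, w2, w3),
    )
    if cur.rowcount == 0:  # no existing row matched, so this tuple is new
        conn.execute(
            "INSERT INTO triples (w1, w2, w3) VALUES (?, ?, ?)", (w1, w2, w3)
        )

# Order matters: ("he", "can", "drink") and ("can", "he", "drink") differ.
for t in [("he", "can", "drink"), ("he", "can", "drink"), ("can", "he", "drink")]:
    bump(conn, *t)

rows = conn.execute(
    "SELECT w1, w2, w3, cnt FROM triples ORDER BY w1, w2, w3"
).fetchall()
print(rows)
```

Every call to bump() costs one index lookup plus, for a new tuple, one index
update, which is what makes this approach slow for a 4 GB source.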

> I thought of doing all the inserts without having an index and without doing
> the check whether the row is already there. After that I'd do a "group by"
> and count(*) on that table. Is this a good idea?

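That sounds like the fastest way to do it, certainly. Loading without an
index avoids a lookup and an index update for every row, and a single
"group by" then counts all the duplicates at once.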
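
A minimal sketch of that load-then-aggregate approach, again using Python's
sqlite3 module as a stand-in for PostgreSQL so that it runs as written. The
names raw_triples, triple_counts and cnt are invented for the example, and
the three sample rows stand in for the real source. In PostgreSQL the load
step would normally be done with COPY rather than individual INSERTs.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; all names are hypothetical.
conn = sqlite3.connect(":memory:")

# Step 1: plain staging table, no index, no duplicate check.
conn.execute("CREATE TABLE raw_triples (w1 varchar, w2 varchar, w3 varchar)")

sample = [
    ("he", "can", "drink"),
    ("he", "can", "drink"),
    ("can", "he", "drink"),
]
conn.executemany("INSERT INTO raw_triples VALUES (?, ?, ?)", sample)

# Step 2: collapse duplicates with one GROUP BY, then index the result.
conn.executescript("""
    CREATE TABLE triple_counts AS
        SELECT w1, w2, w3, count(*) AS cnt
        FROM raw_triples
        GROUP BY w1, w2, w3;
    CREATE INDEX triple_counts_words_idx ON triple_counts (w1, w2, w3);
""")

counts = {
    (w1, w2, w3): cnt
    for w1, w2, w3, cnt in conn.execute(
        "SELECT w1, w2, w3, cnt FROM triple_counts"
    )
}
print(counts)
```

The index is only built once, after the counted table exists, so it is
created over the deduplicated rows rather than maintained row by row during
the load.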

Matthew

--
"We have always been quite clear that Win95 and Win98 are not the systems to
use if you are in a hostile security environment." "We absolutely do recognize
that the Internet is a hostile environment." Paul Leach <paulle(at)microsoft(dot)com>
