Re: Speed Question

From: Manfred Koizar <mkoi-pg(at)aon(dot)at>
To: Noah Silverman <noah(at)allresearch(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Speed Question
Date: 2002-12-21 20:02:39
Message-ID: flg90v07p4gs6mb03fd2vhpm5t4enkbn6a@4ax.com
Lists: pgsql-performance

On Sat, 21 Dec 2002 13:46:05 -0500, Noah Silverman
<noah(at)allresearch(dot)com> wrote:
>Without divulging too many company
>secrets, we create a 32 key profile of an object. We then have to be
>able to search the database to find "similar" objects.

... where "similar" means that the value of each attribute lies within
a small range around the value of the corresponding attribute of the
reference object?
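
To make that reading of "similar" concrete, here is a minimal sketch in Python. It assumes numeric attributes and a single tolerance shared by all of them; the function name and the tolerance parameter are made up for illustration and are not taken from Noah's description.

```python
def is_similar(candidate, reference, tolerance):
    """Return True if every attribute of `candidate` lies within
    +/- `tolerance` of the corresponding attribute of `reference`.

    Assumption: attributes are numeric and one tolerance fits all of them.
    """
    if len(candidate) != len(reference):
        raise ValueError("profiles must have the same number of attributes")
    return all(abs(c - r) <= tolerance for c, r in zip(candidate, reference))
```

In SQL terms this would correspond to one range condition per attribute, all ANDed together.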

I fear a multicolumn b-tree index is not the optimal solution to this
problem, unless you have some extremely selective attributes you can
put at the start of the index. Even then, I doubt that it makes
sense to include the last attribute (or the last few attributes)
in the index.

>In reality, we
>will probably have 20MM to 30MM rows in our table. I need to very
>quickly find the matching records on a "test" object.

This seems to be a nice case for using bitmaps in index scans:
you would scan several single-column indices and combine the
bitmaps before accessing the heap tuples. This has been discussed on
-hackers and I believe it is a TODO item.
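
A toy sketch of the idea in Python, only to illustrate the principle; it is not how PostgreSQL does or would implement it. Sorted lists of (value, row id) pairs stand in for single-column indices, a Python integer serves as the bitmap, and all names (build_index, range_bitmap, find_similar) are hypothetical.

```python
from bisect import bisect_left, bisect_right


def build_index(rows, column):
    """Sorted (value, row_id) pairs: a stand-in for a single-column index."""
    return sorted((row[column], row_id) for row_id, row in enumerate(rows))


def range_bitmap(index, low, high):
    """Scan one index for low <= value <= high; return matches as a bitmap."""
    start = bisect_left(index, (low, -1))
    end = bisect_right(index, (high, float("inf")))
    bitmap = 0
    for _, row_id in index[start:end]:
        bitmap |= 1 << row_id  # set the bit for this row
    return bitmap


def find_similar(rows, indexes, reference, tolerance):
    """AND the per-column bitmaps, then visit only the surviving rows."""
    combined = (1 << len(rows)) - 1  # start with every row as a candidate
    for column, index in enumerate(indexes):
        low = reference[column] - tolerance
        high = reference[column] + tolerance
        combined &= range_bitmap(index, low, high)
        if combined == 0:
            break  # no row can match any more
    return [row_id for row_id in range(len(rows)) if (combined >> row_id) & 1]


rows = [(1, 10, 100), (2, 11, 101), (5, 10, 100), (1, 20, 100)]
indexes = [build_index(rows, column) for column in range(3)]
matches = find_similar(rows, indexes, reference=(1, 10, 100), tolerance=1)
```

Here matches ends up as [0, 1]: only those two rows pass all three range conditions, so only they would have to be fetched from the heap.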

I don't know whether GiST or R-Tree could help. Is anybody listening
who knows?

>If you're really curious as to more details, let me know (I don't want
>to bore the group with our specifics)

The group is patient :-)

Servus
Manfred
