Skip site navigation (1) Skip section navigation (2)

Re: combined indexes with Gist - planner issues?

From: Hans-Juergen Schoenig -- PostgreSQL <postgres(at)cybertec(dot)at>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Zoltan Boszormenyi <zb(at)cybertec(dot)at>
Subject: Re: combined indexes with Gist - planner issues?
Date: 2009-08-31 14:40:21
Message-ID: 4A9BE0D5.7060106@cybertec.at (view raw or flat)
Thread:
Lists: pgsql-hackers
Martijn van Oosterhout wrote:
> On Mon, Aug 31, 2009 at 04:06:22PM +0200, Hans-Juergen Schoenig -- PostgreSQL wrote:
>   
>> ok, i thought it would be something gist specific i was not aware of.
>> the golden question now is: i am looking for the cheapest products given  
>> a certain text in an insane amount of data.
>> how to do it? other quals which could narrow down the amount of data  
>> would not help.
>>
>> i cannot see an option with regular "weapons" ...
>> maybe you can an idea how to fix core to make it work? maybe there is a  
>> mechanism we could need.
>> we really have to make this work - no matter what it takes.
>> we are willing to put effort into that.
>>     
>
> The way I usually attack such a problem is to think of a data
> structure+algorithm that could produce the output you want. Once you've
> got that it's usually clear how you can make postgres do it and what
> changes would need to be made.
>
> At first glance I don't see any nice data structure specific for your
> problem. But it occurs to me that maybe you could just have a (btree)
> index on the price and just scan in asceding order until you have
> enough records. Expensive if the first record is expensive.
>
> Another possibility is to change your query to use the price in the
> GiST index: execute multiple queries of the form:
>
> ... AND display_price >= 0.01 and display_price < 1;
> ... AND display_price >= 1 and display_price < 10;
>
>   

hello ...

i had a similar idea here but the problem is: prices will pretty much 
depends on products.
to get to some critical example: "book" is a horribly frequent word and 
you will find just too many in a too narrow price range.
using a price index is alone is not a good idea. how many products which 
cost USD 9.95 do you know and how many of them are books? :(
i did some experiments which PL/proxy to scale out a little and i wrote 
some C code to explicitly cache data from the start and so on.
this is all shit, however - it is too much data and I have too many request.
i don't want to fallback to some java-based stuff such as solr. it would 
totally ruin my credibility and the stand postgres has at this customer.
whatever it takes - a PG based solution has to be found and implemented.

my knowledge of how gist works internally is not too extensive. any 
"kickstart" idea would be appreciated.

    many thanks,

       hans

-- 
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de


In response to

Responses

pgsql-hackers by date

Next:From: Andrew DunstanDate: 2009-08-31 14:42:26
Subject: Re: Feature request : add REMAP_SCHEMA-like option to pg_restore
Previous:From: Martijn van OosterhoutDate: 2009-08-31 14:24:41
Subject: Re: combined indexes with Gist - planner issues?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group