Martijn van Oosterhout wrote:
> On Mon, Aug 31, 2009 at 04:06:22PM +0200, Hans-Juergen Schoenig -- PostgreSQL wrote:
>> ok, i thought it would be something gist specific i was not aware of.
>> the golden question now is: i am looking for the cheapest products given
>> a certain text in an insane amount of data.
>> how to do it? other quals which could narrow down the amount of data
>> would not help.
>> i cannot see an option with regular "weapons" ...
>> maybe you can an idea how to fix core to make it work? maybe there is a
>> mechanism we could need.
>> we really have to make this work - no matter what it takes.
>> we are willing to put effort into that.
> The way I usually attack such a problem is to think of a data
> structure+algorithm that could produce the output you want. Once you've
> got that it's usually clear how you can make postgres do it and what
> changes would need to be made.
> At first glance I don't see any nice data structure specific for your
> problem. But it occurs to me that maybe you could just have a (btree)
> index on the price and just scan in asceding order until you have
> enough records. Expensive if the first record is expensive.
> Another possibility is to change your query to use the price in the
> GiST index: execute multiple queries of the form:
> ... AND display_price >= 0.01 and display_price < 1;
> ... AND display_price >= 1 and display_price < 10;
i had a similar idea here but the problem is: prices will pretty much
depends on products.
to get to some critical example: "book" is a horribly frequent word and
you will find just too many in a too narrow price range.
using a price index is alone is not a good idea. how many products which
cost USD 9.95 do you know and how many of them are books? :(
i did some experiments which PL/proxy to scale out a little and i wrote
some C code to explicitly cache data from the start and so on.
this is all shit, however - it is too much data and I have too many request.
i don't want to fallback to some java-based stuff such as solr. it would
totally ruin my credibility and the stand postgres has at this customer.
whatever it takes - a PG based solution has to be found and implemented.
my knowledge of how gist works internally is not too extensive. any
"kickstart" idea would be appreciated.
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
In response to
pgsql-hackers by date
|Next:||From: Andrew Dunstan||Date: 2009-08-31 14:42:26|
|Subject: Re: Feature request : add REMAP_SCHEMA-like option to
|Previous:||From: Martijn van Oosterhout||Date: 2009-08-31 14:24:41|
|Subject: Re: combined indexes with Gist - planner issues?|