Re: Cube extension improvement, GSoC

From: Stas Kelvich <stanconn(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Cube extension improvement, GSoC
Date: 2013-05-04 19:19:23
Message-ID: 7E672D6F-DF16-4747-9DD6-00F28CF81F1F@gmail.com
Lists: pgsql-hackers

Hello.

> * Teaching the cube extension to store dimensions with different data types. Such an index would be a good alternative to a compound-key B-Tree multi-index (suitable for range queries and data ordering).
>
> You mean, a cube containing something other than floats? I don't think we want to explode the number of datatypes by doing that; casting between them could be problematic.
>
> At least an option for having a float4 cube instead of a float8 cube seems reasonable to me, because of the space savings paid for by lower accuracy.

The main idea was to reduce the size of the index where possible. This can be very important when we have many dimensions, each with a small number of distinct values.

> But I wonder if you could add cube-like operators for arrays. We already have support for arrays of any datatype, and any number of dimensions. That seems an awful lot like what the cube extension does.

All the cube machinery assumes a fixed number of dimensions, so it is not very useful for arrays, where the number of elements can easily change. One can define an index on an expression over an array, e.g. CREATE INDEX ON table USING GIST(cube(array[a,b,c])), and cube-like operations will work. But that only works when the array has a fixed size. And again, if the array elements are smallints, such a cube uses 8 times more space than is actually needed: four times from converting to float8 and two times from storing coincident cube bounds.
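
To make the factor of 8 concrete, here is a rough back-of-the-envelope sketch in Python. It counts only the coordinate payload and ignores any per-value headers and alignment, so the real on-disk numbers will differ somewhat. The function names are made up for illustration and are not part of the cube extension.

```python
import struct

# Standard fixed-width sizes, matching PostgreSQL's smallint and float8.
SMALLINT_BYTES = struct.calcsize("<h")  # 2 bytes
FLOAT8_BYTES = struct.calcsize("<d")    # 8 bytes


def smallint_payload(ndims):
    """Bytes needed to hold ndims smallint coordinates (payload only)."""
    return ndims * SMALLINT_BYTES


def point_cube_payload(ndims):
    """Bytes for a point stored as a cube with two coincident corners.

    Assumption: both corners are stored, each as ndims float8 values.
    """
    return 2 * ndims * FLOAT8_BYTES


def overhead_factor(ndims):
    """Ratio of the cube payload to the smallint payload."""
    return point_cube_payload(ndims) / smallint_payload(ndims)


if __name__ == "__main__":
    for n in (3, 10):
        print(n, smallint_payload(n), point_cube_payload(n), overhead_factor(n))
```

The ratio does not depend on the number of dimensions: 4x comes from smallint versus float8, and 2x from storing the same corner twice.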

> I think we have at least 3 data types more or less similar to cube.
> 1) array of ranges
> 2) range of arrays
> 3) 2d arrays
> Semantically, cube is closest to an array of ranges. However, an array of ranges has huge storage overhead.
> Also, we could declare cube as a domain over 2d arrays and define operations on that domain.

But what should we do when arrays in different records have different numbers of elements?

Stas Kelvich.
