Re: GSoC 2014 proposal

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Иван Парфилов <iparfilov(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: GSoC 2014 proposal
Date: 2014-04-01 10:41:17
Message-ID: 533A97CD.8030508@vmware.com
Lists: pgsql-hackers

On 03/30/2014 11:50 PM, Иван Парфилов wrote:
> *Quantifiable results*
>
> Adding support of BIRCH algorithm for data type cube

Aside from the details of *how* that would work, the other question is:

Do we want this in contrib/cube? There are currently no clustering
functions, or any other statistical functions or similar, in
contrib/cube. Just basic contains/contained/overlaps operators, and
B-tree comparison operators, which are pretty useless for cube.

Do we want to start adding such features to cube, in contrib? Or should
that live outside the PostgreSQL source tree, in a separate extension,
so that it could live on its own release schedule, etc.? If BIRCH goes
into contrib/cube, that's an invitation to add all kinds of functions to it.

We received another GSoC application to add another clustering algorithm
to the MADlib project. MADlib is an extension to PostgreSQL with a lot
of different statistical tools, so MADlib would be a natural home for
BIRCH too. But if it requires backend changes (i.e. changes to GiST),
then that needs to be discussed on pgsql-hackers, and it probably would
be better to do a reference implementation in contrib/cube. MADlib could
later copy it from there.

- Heikki
