[GSoC] Clustering in MADlib - status update

From: Maxence Ahlouche <maxence(dot)ahlouche(at)gmail(dot)com>
To: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Andreas Scherbaum <ascherbaum(at)gopivotal(dot)com>, Caleb Welton <cwelton(at)gopivotal(dot)com>, Hai Qian <hqian(at)gopivotal(dot)com>, Sujit Philip <sphilip(at)gopivotal(dot)com>, Marc Pantel <Marc(dot)Pantel(at)enseeiht(dot)fr>, "devel(at)madlib(dot)net" <devel(at)madlib(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: [GSoC] Clustering in MADlib - status update
Date: 2014-05-25 17:17:54
Message-ID: CAJeaomUZfGXKyvUB4-6yxK5m+dVMLd+w+5DEm5MbYf2kErB0XA@mail.gmail.com
Lists: pgsql-hackers

Hi,

Here is my first report. You can also find it on my GitLab [0].

Week 1 - 2014/05/25

For this first week, I have written a test script that generates some
simple datasets, and produces an image containing the output of the MADlib
clustering algorithms.

This script can be called like this:

    # generate a dataset called "ds0" with 8 clusters
    ./clustering_test.py new ds0 -n 8

    # write the result of the clustering algorithms applied to ds0 to output.png
    ./clustering_test.py query ds0 -o output.png

See ./clustering_test.py -h for all the available options.
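
To give an idea of what generating such a dataset can look like, here is a minimal sketch. It is not the code from clustering_test.py: the function name generate_dataset, its parameters, and the choice of Gaussian blobs around uniformly drawn centers are all assumptions made for illustration.

```python
import random


def generate_dataset(n_clusters, points_per_cluster=50, spread=1.0,
                     extent=100.0, seed=None):
    """Return a list of (x, y, cluster_id) tuples.

    Hypothetical helper: each cluster gets a center drawn uniformly in
    [0, extent] x [0, extent], and its points are Gaussian noise around it.
    """
    rng = random.Random(seed)
    points = []
    for cluster_id in range(n_clusters):
        cx = rng.uniform(0.0, extent)
        cy = rng.uniform(0.0, extent)
        for _ in range(points_per_cluster):
            x = rng.gauss(cx, spread)
            y = rng.gauss(cy, spread)
            points.append((x, y, cluster_id))
    return points
```

Calling generate_dataset(8, seed=0) would play the role of the "new ds0 -n 8" command above, minus the part that loads the points into the database.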

An example of the output can be found here [1].

Of course, I will keep improving this test script, as it is still far from
perfect; but for now, it does approximately what I want.

Next week, I'll start working on the implementation of k-medoids in
MADlib. As a reminder, according to the timeline I suggested for the
project, this step is due by May 30. Depending on the problems I run
into (mostly lack of knowledge of the codebase, I guess), it might not be
finished on time, but it should be done a few days later (by the end of
next week, hopefully).
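
For readers unfamiliar with k-medoids, here is a rough sketch of the idea in plain Python. It is not MADlib code and not necessarily the variant that will be implemented: it uses a simple alternating scheme (assign each point to its nearest medoid, then pick the best medoid inside each cluster), and the names k_medoids, assign and euclidean are made up for this example. It assumes the input points are distinct, and like k-means it can stop in a local optimum that depends on the random initial medoids.

```python
import random


def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def assign(points, medoids, dist):
    """Map each medoid index to the indices of the points nearest to it."""
    clusters = {m: [] for m in medoids}
    for i, p in enumerate(points):
        nearest = min(medoids, key=lambda m: dist(p, points[m]))
        clusters[nearest].append(i)
    return clusters


def k_medoids(points, k, dist=euclidean, max_iter=100, seed=None):
    """Return (medoid indices, clusters) for distinct input points."""
    rng = random.Random(seed)
    # Start from k distinct points chosen at random.
    medoids = rng.sample(range(len(points)), k)
    for _ in range(max_iter):
        clusters = assign(points, medoids, dist)
        new_medoids = []
        for m, members in clusters.items():
            if not members:
                # Cannot happen with distinct points; kept as a guard.
                new_medoids.append(m)
                continue
            # The new medoid is the member with the smallest total
            # distance to the other members of its cluster.
            best = min(members,
                       key=lambda c: sum(dist(points[c], points[j])
                                         for j in members))
            new_medoids.append(best)
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return medoids, assign(points, medoids, dist)
```

Unlike k-means, the cluster centers are always actual points of the dataset, which is why only a distance function is needed and no mean has to be computed.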

Attached is the patch containing everything I have done this week, though
the git log might be more convenient to read.

Regards,

Maxence A.

[0] http://git.viod.eu/viod/gsoc_2014/blob/master/reports.rst
[1] http://git.viod.eu/viod/gsoc_2014/blob/master/clustering_test/example_dataset.png

--
Maxence Ahlouche
06 06 66 97 00

Attachment    Content-Type    Size
week1.patch   text/x-patch    25.5 KB
