Re: Google's Summer of Code ...

From: "Meredith L(dot) Patterson" <mlp(at)thesmartpolitenerd(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Google's Summer of Code ...
Date: 2005-06-01 22:08:12
Message-ID: 429E31CC.8070907@thesmartpolitenerd.com
Lists: pgsql-hackers

Simon Riggs wrote:
> Is it possible that you could put sufficient of the application into
> PostgreSQL to genericise some features? Stonebraker's Third Wave was
> *all* about putting data intensive operations closer to where the data
> is stored/accessed.

And just like that, a lightbulb goes off in my head.

I'd been reluctant to push the training step inside the engine, because
I couldn't come up with a good way of doing it, but now it seems so
obvious. A ranking support vector machine takes as input a series of
partial orders -- think of it as several "buckets" into which data items
are thrown. Or, if you will, a list of lists of unique identifiers. And
that would be *easy* to pass as part of a query string. I'm envisioning
a syntax like:

ORDER BY SVM linear KEY foo ((1, 2, 3), (4, 5), (6, 7, 8), (9))

So this would be a partial ordering where each number is the key (PK is
column 'foo') of some tuple in a table, and {1, 2, 3} > {4, 5} >
{6, 7, 8} > {9} in terms of the user's preference. Items within the same
bucket are not ranked against each other.
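
To make the bucket semantics concrete, here is a minimal Python sketch of how such a list of lists might be expanded into the pairwise preferences a ranking SVM is typically trained on. The function name `preference_pairs` is hypothetical, and the sketch assumes the buckets are listed from most to least preferred, as in the example above.

```python
from itertools import combinations

def preference_pairs(buckets):
    """Expand ordered buckets into (preferred_key, less_preferred_key) pairs.

    Assumes buckets are listed from most to least preferred. Keys that
    share a bucket produce no pair, since they are not ranked against
    each other.
    """
    pairs = []
    # combinations() keeps input order, so `higher` always precedes `lower`.
    for higher, lower in combinations(buckets, 2):
        for a in higher:
            for b in lower:
                pairs.append((a, b))
    return pairs

# The partial order from the proposed ORDER BY syntax.
buckets = [(1, 2, 3), (4, 5), (6, 7, 8), (9,)]
pairs = preference_pairs(buckets)
```

Each pair reads as "the first key should rank above the second", which is the form of constraint the training step would consume.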

Use that (much more human-readable than I had originally envisioned)
input to learn the actual ranking function inside the database, apply
that ranking to the results, and boom -- an ORDER BY clause extrapolated
directly from a partial ranking, with no pesky outside-the-database
learning step.
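
As a rough illustration of the "learn the ranking function, then apply it" idea, the following Python sketch trains a linear ranker on pairwise preferences with a hinge loss and simple subgradient updates, then sorts keys by the learned score. It is a stand-in under stated assumptions, not the method an in-database implementation would necessarily use: the feature vectors, the names `train_linear_ranker` and `score`, and the hyperparameter values are all invented for the example.

```python
def score(w, x):
    """Linear ranking score: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear_ranker(features, pairs, epochs=200, lr=0.1, reg=0.01):
    """Fit weights so score(better) exceeds score(worse) by a margin of 1.

    features: dict mapping key -> list of floats (hypothetical columns).
    pairs:    list of (better_key, worse_key) preferences.
    """
    dim = len(next(iter(features.values())))
    w = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [a - b for a, b in zip(features[better], features[worse])]
            margin = score(w, diff)
            # L2 regularisation step: shrink the weights slightly.
            w = [wi * (1.0 - lr * reg) for wi in w]
            # Hinge-loss subgradient step when the margin is violated.
            if margin < 1.0:
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w

# Hypothetical feature vectors for four tuples, keyed by primary key.
features = {1: [3.0, 1.0], 2: [2.0, 1.0], 3: [1.0, 1.0], 4: [0.0, 1.0]}
# Preferences implied by the buckets ((1, 2), (3), (4)).
pairs = [(1, 3), (2, 3), (1, 4), (2, 4), (3, 4)]

w = train_linear_ranker(features, pairs)
# Apply the learned function to every key, best first.
ranked = sorted(features, key=lambda k: score(w, features[k]), reverse=True)
```

Note that the learned function also orders keys 1 and 2 relative to each other, even though the input left them unranked; that extrapolation from a partial ranking to a full ordering is the point of the approach.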

(Nonlinear kernels have some additional parameters, and tuning them can
be something of a black art, but the syntax can be extended to let
people specify them. Default values would also be necessary.)
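
For a sense of what those extra parameters look like, here is a small Python sketch of two common nonlinear kernels with default values. The defaults shown (gamma of 1 / number of features, degree 3, coef0 of 1.0) are illustrative assumptions for this example, not values proposed in this thread.

```python
import math

def rbf_kernel(x, y, gamma=None):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    if gamma is None:
        # Assumed default: scale by the number of features.
        gamma = 1.0 / len(x)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def poly_kernel(x, y, degree=3, coef0=1.0):
    """Polynomial kernel: (x . y + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree

# Identical points have RBF similarity 1; it decays with distance.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0])
apart = rbf_kernel([0.0, 0.0], [1.0, 1.0])
```

An extended syntax would need some way to name the kernel and override parameters such as `gamma` or `degree`, falling back to defaults like these when they are omitted.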

I'll continue to think on this, but already this approach strikes me as
a lot more useful to the average user. Thanks, Simon!

Cheers,
Meredith
