Re: PostgreSQL - 'SKYLINE OF' clause added!

From: Nikita <nikita(dot)p(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Shane Ambler" <pgsql(at)sheeky(dot)biz>, "Josh Berkus" <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org, "Gavin Sherry" <swm(at)alcove(dot)com(dot)au>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Chris Browne" <cbbrowne(at)acm(dot)org>
Subject: Re: PostgreSQL - 'SKYLINE OF' clause added!
Date: 2007-03-08 19:05:46
Message-ID: a6a6f4f90703081105wa68dfcey736708f41f3d956c@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Few things from our side:

1. 'Skyline Of' is a new operator proposed in ICDE 2003, one of the topmost
conferences of Data Engineering. Skyline operation is a hot area of research
in query processing. Many of the database community people do know about
this operator, and it is fast catching the attention.

2. The skyline operation is very useful in data analysis. Suppose, if we
have a cricket database, and we want to find the bowlers who have taken
maximum wickets in minimum overs, we can issue an easy-to-write query using
'Skyline of' syntax as follows:

Select * from Player_Match Skyline Of overs_bowled min, wickets_taken max;

This query gives 25 interesting tuples (result set) out of 24750 tuples in
0.0509 seconds. The same result is obtained in 0.8228 seconds if the
following equivalent nested-query is issued:

select * from Player_Match p1 where not exists ( select * from Player_Match
p2 where p2.overs_bowled <= p1.overs_bowled and p2.wickets_taken >=
p1.wickets_taken and (p2.overs_bowled < p1.overs_bowled or p2.wickets_taken>
p1.wickets_taken))

Note that the above time is the time elapsed between issuing a query and
obtaining the result set.
As can be seen, the above query looks pretty cumbersome to write and is
inefficient too. So, which query will the user prefer? As the number of
dimensions increases, writing a nested-query will become a hedious task.
Btw, how can such a query be written using aggregate function syntax??

3. As far as optimizing the Skyline is concerned, it is still a research
problem since it requires estimating the cardinality of the skyline result
set.

4. Until and unless this operator is implemented in a popular database
system, how can a user ever get to know about it and hence appreciate its
usefulness?

Btw, it was our B.Tech final year project, and not a term project :-)

Regards.

On 3/8/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Shane Ambler <pgsql(at)Sheeky(dot)Biz> writes:
> > Tom Lane wrote:
> >> Well, whether it's horrible or not is in the eye of the beholder, but
> >> this is certainly a non-standard syntax extension.
>
> > Being non-standard should not be the only reason to reject a worthwhile
> > feature.
>
> No, but being non-standard is certainly an indicator that the feature
> may not be of widespread interest --- if it were, the SQL committee
> would've gotten around to including it; seems they've managed to include
> everything but the kitchen sink already. Add to that the complete lack
> of any previous demand for the feature, and you have to wonder where the
> market is.
>
> > The fact that several
> > different groups have been mentioned to be working on this feature would
> > indicate that it is worth considering.
>
> It looks to me more like someone published a paper that caught the
> attention of a few profs looking for term projects for their students.
>
> Now maybe it really is the best idea since sliced bread and will be seen
> in the next SQL spec edition, but color me skeptical. It seems to me
> to be a very narrow-usage extension, as opposed to (eg) multi-input
> aggregates or WITH/RECURSIVE, which provide general mechanisms applicable
> to a multitude of problems. Now even so it would be fine if the
> implementation were similarly narrow in scope, but the published
> description of the patch mentions a large chunk of additional executor
> mechanisms. If we're going to be adding as much code as that, I'd like
> to see a wider scope of usage for it.
>
> Basically, this patch isn't sounding like it has a reasonable
> bang-to-the-buck ratio ...
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>

--
Pride sullies the noblest character

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2007-03-08 19:16:36 Re: Stream bitmaps
Previous Message Jim C. Nasby 2007-03-08 18:54:57 Re: Auto creation of Partitions