Re: Berkeley and CMU classes adopt/extend PostgreSQL

From: "Marc G(dot) Fournier" <scrappy(at)hub(dot)org>
To: Joe Hellerstein <jmh(at)cs(dot)berkeley(dot)edu>
Cc: hannu(at)tm(dot)ee, tgl(at)sss(dot)pgh(dot)pa(dot)us, bruno(at)wolff(dot)to, gsstark(at)mit(dot)edu, pgsql-hackers(at)postgresql(dot)org, Spiros Papadimitriou <spapadim+(at)cs(dot)cmu(dot)edu>, Anastassia Ailamaki <natassa+(at)cs(dot)cmu(dot)edu>, geoff(at)pgsql(dot)com, Sailesh Krishnamurthy <sailesh(at)EECS(dot)Berkeley(dot)EDU>
Subject: Re: Berkeley and CMU classes adopt/extend PostgreSQL
Date: 2003-02-14 23:46:10
Message-ID: 20030214194204.P23108@hub.org
Lists: pgsql-hackers

On Tue, 11 Feb 2003, Joe Hellerstein wrote:

> Hi all:
> I emailed Marc Fournier on this topic some weeks back, but haven't
> heard from him.

And most public apologies for that ... this past month has been a complete
nightmare all around ... we're just finishing up moving our office, and
finally have phone lines again, and hope to have internet again starting
tomorrow ... :(

> 1) We changed the course projects to make the students hack PostgreSQL
> internals, rather than the "minibase" eduware
> 2) We are coordinating the class with a class at CMU being taught by
> Prof. Anastassia ("Natassa") Ailamaki
>
> Our "Homework 2", which is being passed out this week, will ask the
> students to implement a hash-based grouping that spills to disk. I
> understand this topic has been batted about the pgsql-hackers list
> recently. The TAs who've prepared the assignment (Sailesh
> Krishnamurthy at Berkeley and Spiros Papadimitriou at CMU) have also
> implemented a reference solution to the assignment. Once we've got the
> students' projects all turned in, we'll be very happy to contribute our
> code back to the PostgreSQL project.
>
> I'm hopeful this will lead to many good things:
>
> 1) Each year we can pick another feature to assign in class, and
> contribute back. We'll need to come up with well-scoped engine
> features that exercise concepts from the class -- eventually we'll run
> out of tractable things that PGSQL needs, but not in the next couple
> years I bet.
>
> 2) We'll raise a crop of good students who know Postgres internals.
> Roughly half the Berkeley EECS undergrads take the DB class, and all of
> them will be post-hackers! (Again, I don't know the stats at CMU.)
>
> So consider this a heads up on the hash-agg front, and on the future
> contributions front. I'll follow up with another email on
> PostgreSQL-centered research in our group at Berkeley as well.
>
> Another favor I'd ask is that people on the list be a bit hesitant
> about helping our students with their homework! We would like them to
> do it themselves, more or less :-)
>
> Regards,
> Joe Hellerstein
>
> --
>
> Joseph M. Hellerstein
> Professor, EECS Computer Science Division
> UC Berkeley
> http://www.cs.berkeley.edu/~jmh
>
>
> On Tuesday, February 11, 2003, at 06:54 PM, Sailesh Krishnamurthy
> wrote:
>
> > From: Hannu Krosing <hannu(at)tm(dot)ee>
> > Date: Tue Feb 11, 2003 12:21:26 PM US/Pacific
> > To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> > Cc: Bruno Wolff III <bruno(at)wolff(dot)to>, Greg Stark <gsstark(at)mit(dot)edu>,
> > pgsql-hackers(at)postgresql(dot)org
> > Subject: Re: [HACKERS] Hash grouping, aggregates
> >
> >
> > Tom Lane kirjutas T, 11.02.2003 kell 18:39:
> >> Bruno Wolff III <bruno(at)wolff(dot)to> writes:
> >>> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >>>> Greg Stark <gsstark(at)mit(dot)edu> writes:
> >>>>> The neat thing is that hash aggregates would allow grouping on
> >>>>> data types that
> >>>>> have = operators but no useful < operator.
> >>>>
> >>>> Hm. Right now I think that would barf on you, because the parser
> >>>> wants
> >>>> to find the '<' operator to label the grouping column with, even if
> >>>> the
> >>>> planner later decides not to use it. It'd take some redesign of the
> >>>> query data structure (specifically SortClause/GroupClause) to avoid
> >>>> that.
> >>
> >>> I think another issue is that for some = operators you still might
> >>> not
> >>> be able to use a hash. I would expect the discussion for hash joins
> >>> in
> >>> http://developer.postgresql.org/docs/postgres/xoper-optimization.html
> >>> would apply to hash aggregates as well.
> >>
> >> Right, the = operator must be hashable or you're out of luck. But we
> >> could imagine tweaking the parser to allow GROUP BY if it finds a
> >> hashable = operator and no sort operator. The only objection I can
> >> see
> >> to this is that it means the planner *must* use hash aggregation,
> >> which
> >> might be a bad move if there are too many distinct groups.
> >
> > If we run out of sort memory, we can always bail out later, preferably
> > with a descriptive error message. It is not as elegant as erring out at
> > parse (or even plan/optimise) time, but the result is /almost/ the
> > same.
> >
> > Relying on hash aggregation will become essential if we are ever going
> > to implement the "other" groupings (CUBE, ROLLUP, (), ...), so it would
> > be nice if hash aggregation could also overflow to disk - I suspect
> > that
> > this will still be faster than running an independent scan for each
> > GROUP BY grouping and merging the results.
> >
> > -----
> > Hannu
> >
> >
> > ---------------------------(end of
> > broadcast)---------------------------
> > TIP 1: subscribe and unsubscribe commands go to
> > majordomo(at)postgresql(dot)org
> >
> >
> >
> >
> > --
> > Pip-pip
> > Sailesh
> > http://www.cs.berkeley.edu/~sailesh
> >
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>
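
For readers following the thread, here is a minimal sketch of the idea under discussion: hash-based grouping that overflows to disk when the in-memory table fills up. It is not the Berkeley/CMU reference solution and not PostgreSQL code. The function names, the `max_groups` limit (standing in for a sort-memory budget), and the choice of COUNT(*) as the aggregate are all assumptions made for illustration. Note that the keys only need equality and a hash function, never a '<' operator.

```python
import pickle
import tempfile


def _read_keys(f):
    """Yield pickled keys from a spill file until it is exhausted."""
    while True:
        try:
            yield pickle.load(f)
        except EOFError:
            return


def hash_aggregate(keys, max_groups, num_partitions=4, _depth=0):
    """Yield (key, count) pairs, keeping at most max_groups groups in memory.

    Hypothetical illustration only: max_groups stands in for a memory budget.
    """
    if max_groups < 1:
        raise ValueError("max_groups must be at least 1")
    groups = {}
    spill_files = [None] * num_partitions
    for key in keys:
        if key in groups:
            groups[key] += 1          # group already in memory: just advance it
        elif len(groups) < max_groups:
            groups[key] = 1           # room for a new group
        else:
            # Table is full: write the key to one of the spill partitions.
            # The depth is mixed into the hash so a later pass splits the
            # keys differently from this one.
            p = hash((_depth, key)) % num_partitions
            if spill_files[p] is None:
                spill_files[p] = tempfile.TemporaryFile()
            pickle.dump(key, spill_files[p])
    # The table never evicts, so a spilled key can never also be in memory;
    # the in-memory groups are therefore complete and can be emitted now.
    yield from groups.items()
    # Aggregate each spill partition in turn, recursing if it overflows again.
    for f in spill_files:
        if f is None:
            continue
        f.seek(0)
        yield from hash_aggregate(_read_keys(f), max_groups,
                                  num_partitions, _depth + 1)
        f.close()
```

As a usage example, Python's complex numbers support equality and hashing but have no ordering, so a sort-based grouping could not handle them, while `dict(hash_aggregate([1j, 2j, 1j, 3j], max_groups=2))` groups them without trouble. Each pass retires at least `max_groups` distinct keys, so the recursion terminates.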
