Re: Hadoop backend?

From: pi song <pi(dot)songs(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Hadoop backend?
Date: 2009-02-24 04:20:33
Message-ID: 1b29507a0902232020l5c40e4c3jea36338fafbd244@mail.gmail.com
Lists: pgsql-hackers

| I believe there is more than that which would need to be done nowadays.
| I seem to recall that the storage manager abstraction has slowly been
| dedicated/optimized for md over the past 6 years or so. It may even be
| easier/preferred to write a hadoop specific access method depending on
| what you're looking for from hadoop.

I think you're quite right. What Postgres needs is an access method
abstraction. One should be able to plug in an access method for SSDs or
network file systems where appropriate. I'm not talking about the MapReduce
part of Hadoop, because I think that's a different story. What you need for
MapReduce is 1) a data store which feeds you data, and then 2) MapReduce does
the query processing. This has nothing in common with the Postgres query
processor. If you just want data from Postgres, then it should be easier to
build a Postgres data feeder in Hadoop (which might even already exist).
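
To make the two-stage split concrete, here is a minimal sketch in plain
Python. It is not Hadoop's API and not Postgres code: feed_rows, map_reduce
and run_example are hypothetical names, and the hard-coded rows merely stand
in for data that a feeder would pull out of a Postgres table.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Row = Tuple[str, int]


def feed_rows() -> Iterator[Row]:
    """Stage 1 (hypothetical feeder): hand rows to the processing stage.

    The literal rows below are placeholders for rows read from a table.
    """
    yield from [("a", 1), ("b", 2), ("a", 3)]


def map_reduce(
    rows: Iterable[Row],
    mapper: Callable[[Row], Iterable[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Stage 2: a toy, single-process map/shuffle/reduce over the fed rows."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for row in rows:
        # Map: each input row may emit any number of (key, value) pairs.
        for key, value in mapper(row):
            # Shuffle: collect values under their key.
            groups[key].append(value)
    # Reduce: collapse each key's values into one result.
    return {key: reducer(key, values) for key, values in groups.items()}


def run_example() -> Dict[str, int]:
    """Sum the second column grouped by the first column."""
    return map_reduce(
        feed_rows(),
        mapper=lambda row: [(row[0], row[1])],
        reducer=lambda key, values: sum(values),
    )
```

Note that the processing stage only ever sees an iterator of rows, so
nothing in it depends on how the database plans or executes queries.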

Pi Song

On Tue, Feb 24, 2009 at 11:24 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> "Jonah H. Harris" <jonah(dot)harris(at)gmail(dot)com> writes:
> > I believe there is more than that which would need to be done nowadays.
> > I seem to recall that the storage manager abstraction has slowly been
> > dedicated/optimized for md over the past 6 years or so.
>
> As far as I can tell, the PG storage manager API is at the wrong level
> of abstraction for pretty much everything. These days, everything we do
> is atop the Unix filesystem API, and anything that smgr might have been
> able to do for us is getting handled in kernel filesystem code or device
> drivers. (Back in the eighties, when it was more plausible for PG to do
> direct device access, maybe smgr was good for something; but no more.)
>
> It's interesting to speculate about where we could draw an abstraction
> boundary that would be more useful. I don't think the MySQL guys got it
> right either...
>
> regards, tom lane
>
