Re: Hadoop backend?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pi(dot)songs(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Hadoop backend?
Date: 2009-02-22 20:47:15
Message-ID: 603c8f070902221247m4b7d412o2945de7e617ebf15@mail.gmail.com
Lists: pgsql-hackers

On Sat, Feb 21, 2009 at 9:37 PM, pi song <pi(dot)songs(at)gmail(dot)com> wrote:
> 1) The Hadoop file system is heavily optimized for mostly-read workloads.
> 2) As of a few months ago, HDFS doesn't support file appending.
> There might be a bit of impedance mismatch in making them work together.
> However, I think it would be a very good initiative to come up with ideas
> for running Postgres on a distributed file system (it doesn't have to be
> Hadoop specifically).

In theory, I think you could make postgres work on any type of
underlying storage you like by writing a second smgr implementation
that would exist alongside md.c. The fly in the ointment is that
you'd need a more sophisticated implementation of this line of code,
from smgropen:

reln->smgr_which = 0; /* we only have md.c at present */
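
To give a rough idea of what "a second smgr implementation" means (this
is only a sketch, with abridged signatures, and the hypodfs_* names are
made up), the function-pointer table in smgr.c would grow a second entry
alongside the md.c one, something like:

    /*
     * Sketch only: the real f_smgr struct in smgr.c has more operations
     * (create, extend, unlink, nblocks, truncate, sync, ...) and slightly
     * different signatures; hypodfs_* are hypothetical functions.
     */
    typedef struct f_smgr
    {
        void    (*smgr_init) (void);
        void    (*smgr_read) (SMgrRelation reln, BlockNumber blocknum,
                              char *buffer);
        void    (*smgr_write) (SMgrRelation reln, BlockNumber blocknum,
                               char *buffer, bool isTemp);
        /* ... remaining operations elided ... */
    } f_smgr;

    static const f_smgr smgrsw[] = {
        /* 0: magnetic disk, md.c */
        { mdinit, mdread, mdwrite /* , ... */ },
        /* 1: hypothetical distributed filesystem, hypodfs.c */
        { hypodfs_init, hypodfs_read, hypodfs_write /* , ... */ }
    };

With that in place, everything that goes through the smgr layer would
work against either backend; the remaining problem is deciding, per
relation, which index into smgrsw[] to use.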

Logically, it seems like the choice of smgr should track with the
notion of a tablespace. IOW, you might want to have one tablespace that
is stored on a magnetic disk (md.c) and another that is stored on your
hypothetical distributed filesystem (hypodfs.c). I'm not sure how
hard this would be to implement, but I don't think smgropen() is in a
position to do syscache lookups, so probably not that easy.
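
To make that concrete (again just a sketch, and exactly where the
tablespace-to-smgr mapping would come from is the open question),
smgropen() would have to end up doing something along these lines:

    SMgrRelation
    smgropen(RelFileNode rnode)
    {
        ...
        /* Instead of hard-wiring md.c: */
        reln->smgr_which = 0;       /* we only have md.c at present */

        /*
         * ...we'd want something like the line below, except that
         * smgropen() probably can't do a syscache lookup here, so the
         * choice might have to be passed in by the caller or derived
         * from the tablespace OID some other way.
         * tablespace_get_smgr() is a hypothetical helper, not an
         * existing function.
         */
        reln->smgr_which = tablespace_get_smgr(rnode.spcNode);
        ...
    }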

...Robert
