Re: [GSoC] (Is it OK to choose items without % mark in theToDoList) && (is it an acceptable idea to build index on Flash Disk)

From: mx <mx(dot)cogito(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [GSoC] (Is it OK to choose items without % mark in theToDoList) && (is it an acceptable idea to build index on Flash Disk)
Date: 2008-03-25 02:10:16
Message-ID: ded849dd0803241910s1e701c70nc30d6cf4b4da4c29@mail.gmail.com
Lists: pgsql-hackers

Thank you for your suggestion!

> The biggest problem with the hash index is currently that there's no
> significant performance over b-tree. If you want to work on hash
> indexes, I would suggest doing benchmarking and looking at ways to
> improve performance, before spending time on making it multi-column
> capable. And missing WAL logging is a big issue as well.

That's a good suggestion! My work would be useless if the hash index does
not perform well enough.
I'll follow your advice and look at improving hash index performance first.
That is more challenging and exciting work.

On Tue, Mar 25, 2008 at 12:23 AM, Heikki Linnakangas <
heikki(at)enterprisedb(dot)com> wrote:

> Maybe, hard to tell without more details. What difference does it make
> if the b-tree is on a flash device, as opposed to disk? What's different
> in general when you run on a flash disk?
>
> The "embedded server" idea in the "not wanted" list refers to the idea
> of running PostgreSQL in the same process as the client. If I understood
> you correctly, you're proposing something quite different.
>

OK, I'll explain it in more detail.

The basic unit of flash is the page (typically 512 to 2048 bytes).
Pages are organized into blocks, typically of 32 or 64 pages.
All read and write operations happen at page granularity, but erase
operations happen at block granularity.

Flash has a peculiar characteristic called "erase-before-write": you can't
just overwrite a page; you have to erase the whole block and then write the
page. So a read operation is faster than a write operation (about 2 to 200
times, depending on the device).
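
To illustrate the point, here is a toy model in Python. It is only a sketch
under my own assumptions: the class FlashBlock, its method names, and the
choice of 64 pages per block are made up for illustration and do not
describe any real device or driver. It shows that updating one page in
place costs a whole-block erase plus rewriting every live page.

```python
# Toy model of flash: a page can be programmed only once after an erase,
# and erase works on a whole block. All names and sizes are hypothetical.
PAGES_PER_BLOCK = 64


class FlashBlock:
    def __init__(self):
        self.pages = [None] * PAGES_PER_BLOCK  # None means "erased"
        self.erase_count = 0
        self.program_count = 0

    def read(self, page_no):
        return self.pages[page_no]

    def program(self, page_no, data):
        # A page that already holds data cannot be programmed again.
        if self.pages[page_no] is not None:
            raise ValueError("page must be erased before it is written")
        self.pages[page_no] = data
        self.program_count += 1

    def erase(self):
        # Erase always clears every page in the block.
        self.pages = [None] * PAGES_PER_BLOCK
        self.erase_count += 1

    def overwrite_in_place(self, page_no, data):
        # Naive in-place update: keep a copy of the live pages, erase the
        # block, then write everything back with the one page changed.
        saved = list(self.pages)
        saved[page_no] = data
        self.erase()
        for i, content in enumerate(saved):
            if content is not None:
                self.program(i, content)


block = FlashBlock()
for i in range(PAGES_PER_BLOCK):
    block.program(i, f"v1-{i}")

programs_before = block.program_count
block.overwrite_in_place(3, "v2-3")
extra_programs = block.program_count - programs_before
```

In this model, changing a single page triggers one block erase and
PAGES_PER_BLOCK page programs, which is why in-place B-tree updates fit
flash so poorly.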

This is a big problem when we simply run a database designed for magnetic
disks. We just overwrite a page when we update a B-tree, but that is not a
good way to update on a flash disk. There are currently some research
results on this problem. They use a method similar to log-structured file
systems, in which every node is encoded as a set of log entries, so they
can reduce the cost of updates by using a log.
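
A rough sketch of the idea in Python follows. This is not the exact scheme
from the research I mentioned; the class LogStructuredNode and the
("put"/"del", key, value) entry format are my own hypothetical
simplification. A node is never overwritten; each change is appended as a
log entry, and the current contents are rebuilt by replaying the entries.

```python
# Sketch: a B-tree node kept as an append-only list of log entries instead
# of a page that is overwritten. Names and entry format are hypothetical.
class LogStructuredNode:
    def __init__(self):
        self.log = []  # append-only list of (op, key, value) tuples

    def insert(self, key, value):
        self.log.append(("put", key, value))

    def delete(self, key):
        self.log.append(("del", key, None))

    def materialize(self):
        # Replay the log in order to rebuild the node's current contents.
        entries = {}
        for op, key, value in self.log:
            if op == "put":
                entries[key] = value
            else:
                entries.pop(key, None)
        return sorted(entries.items())


node = LogStructuredNode()
node.insert(10, "tid-a")
node.insert(5, "tid-b")
node.insert(10, "tid-c")  # an update is appended, nothing is overwritten
node.delete(5)
current = node.materialize()
```

The trade-off is that a read has to replay the log entries of a node, so
the cost moves from writes to reads, which are the cheaper operation on
flash.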

In my opinion, we would have to change the access methods and some parts of
the storage manager substantially. Is that too hard for a beginner to take
on as a GSoC project?

Finally, please make some suggestions, thanks!

--
Have a good day;-)
Best Regards,
Meng Xiao

━━━━━━━━━━━━━━━━━━━━━━━━━
Data and Knowledge Engineering Research Center,CS&T
Harbin Institute of Technology, Harbin, China
Gtalk: mx(dot)cogito(at)gmail(dot)com
MSN: cnEnder(at)live(dot)com
Blog: http://xiaomeng.yo2.cn
