I am interested in the MERGE command implementation as my gSoC project

From: Zhai Boxuan <bxzhai(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: I am interested in the MERGE command implementation as my gSoC project
Date: 2010-03-30 07:26:23
Message-ID: b6f1201d1003300026k30c272d3jf6d4785ade549b4b@mail.gmail.com
Lists: pgsql-hackers

To whom it may concern,

My name is Zhai Boxuan, a student from China.

I am now a Master's student at the National University of Singapore. Before I
came to Singapore, I earned another master's degree at Wuhan University. During
that period, I focused mainly on implementing a novel object-oriented database
model on PostgreSQL 7.4. I have great interest in the projects offered by
PostgreSQL in this year's Google Summer of Code. I think the MERGE command on
the TODO list is a suitable topic for me.

I have read some information about the MERGE command, which has not been
implemented yet in PostgreSQL 8. I have considered the problem and have a
brief plan for the work.

1. We need to update backend/parser/gram.y to add the SQL-style MERGE
command to the parser. I can do this, since it is similar to what I was
doing in China. A new “MergeStmt” structure should be designed to hold the
parsed command information.
The structure definitely needs one SelectStmt to hold the subquery for the
source table and a list of expressions for the MATCH condition. Some other
expression lists are also needed to specify the additional conditions for
the matched and/or not-matched cases.
This is relatively easy to implement, since we can reuse many components of
the SELECT command.
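
A rough sketch of what such a structure might hold, written in Python only
for brevity (the real node would be a C struct built by the parser). The
class and field names are assumptions for illustration, not existing
PostgreSQL definitions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MergeAction:
    """One WHEN branch of a MERGE statement (hypothetical layout)."""
    matched: bool                           # True for a MATCH branch, False for NOT MATCH
    command: str                            # "UPDATE", "DELETE" or "INSERT"
    extra_condition: Optional[str] = None   # additional condition, if any


@dataclass
class MergeStmt:
    """Hypothetical stand-in for the parse node described above."""
    target: str                             # name of the target table
    source: str                             # source subquery (a SelectStmt in the parser)
    match_conditions: List[str]             # expressions of the MATCH condition
    actions: List[MergeAction] = field(default_factory=list)


# Example: update rows that match, insert source rows that do not.
stmt = MergeStmt(
    target="stock",
    source="SELECT item, qty FROM shipment",
    match_conditions=["stock.item = s.item"],
    actions=[
        MergeAction(matched=True, command="UPDATE"),
        MergeAction(matched=False, command="INSERT"),
    ],
)
```

Keeping the actions in one ordered list, each flagged as matched or not
matched, is only one possible layout; two separate lists would work as well.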

2. In analyze.c we need to add a function that transforms this MergeStmt
into a Query node.
It is necessary to add a new command type for MERGE, which is a plannable
command.
We need to check the semantic correctness of the statement. What I am
thinking of is combining the target table and the source table into a
single SELECT query.
If there is no NOT MATCH option, we can generate a normal query node for
something like “SELECT * FROM target, source WHERE match-condition;”.
Otherwise, to handle NOT MATCH actions we have to do a cross join, with a
query like “SELECT * FROM target, source;”.
The benefit is that we can almost fully reuse the rewriter and planner to
transform this generated query into a structure the executor accepts.
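
The two generated queries can be tried out with a small sketch. It uses
Python's sqlite3 module purely as a convenient SQL engine; the helper
build_join_query and the sample tables are assumptions for illustration and
have nothing to do with the PostgreSQL code itself.

```python
import sqlite3


def build_join_query(target, source, match_condition, has_not_match):
    """Return the SELECT described above (hypothetical helper)."""
    query = f"SELECT * FROM {target}, {source}"
    if not has_not_match:
        # Only matched pairs are needed, so the condition can filter the join.
        query += f" WHERE {match_condition}"
    return query


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE target (id INTEGER, qty INTEGER);
    CREATE TABLE source (id INTEGER, qty INTEGER);
    INSERT INTO target VALUES (1, 10), (2, 20);
    INSERT INTO source VALUES (2, 5), (3, 7);
""")

condition = "target.id = source.id"

# Without NOT MATCH actions: only the matching pairs are returned.
matched_only = conn.execute(
    build_join_query("target", "source", condition, False)
).fetchall()

# With NOT MATCH actions: the cross join returns every pair, so the
# source row with id 3 (which matches nothing) is still visible.
all_pairs = conn.execute(
    build_join_query("target", "source", condition, True)
).fetchall()
```

With this sample data the filtered join loses the source row with id 3,
which is why the unfiltered form is needed once NOT MATCH actions exist.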

3. A plan is needed for the query. The planner should accept this new
plannable command. However, as mentioned above, the real work will be to
build a traditional query plan for the generated SELECT query over the
target and source tables, and then wrap this plan in an outer plan node
designed specifically for the MERGE command.

4. How to execute the query? I am still not very clear on this. The basic
idea is that for each tuple returned by the SELECT query generated above
(the tuple contains all the attributes of both the source and the target
table), we test it against the MATCH and/or NOT MATCH conditions and perform
the corresponding action based on the test result.
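
The idea can be sketched on in-memory rows, under a few stated assumptions.
The function merge and its parameters are hypothetical names. One detail
goes beyond a per-tuple test: in a cross join, a source row counts as not
matched only when no target row satisfies the condition, so the sketch
tracks that per source row. It also collects the actions first and applies
them after the scan, which is only one possible way to keep them from
disturbing the join.

```python
def merge(target, source, key, on_matched, on_not_matched):
    """Apply MERGE-like actions to `target` (a list of dicts) in place.

    All names here are hypothetical. Rows are paired as a cross product,
    and actions are collected first so they cannot disturb the scan.
    """
    updates, inserts = [], []
    for src in source:
        found = False
        for idx, tgt in enumerate(target):   # cross join: every pair is visited
            if tgt[key] == src[key]:         # MATCH condition
                found = True
                updates.append((idx, on_matched(tgt, src)))
        if not found:                        # no target row matched this source row
            inserts.append(on_not_matched(src))
    # Apply the collected actions only after the scan has finished.
    for idx, new_row in updates:
        target[idx] = new_row
    target.extend(inserts)
    return target


stock = [{"item": "bolt", "qty": 10}, {"item": "nut", "qty": 20}]
shipment = [{"item": "nut", "qty": 5}, {"item": "screw", "qty": 7}]

result = merge(
    stock, shipment, "item",
    on_matched=lambda t, s: {"item": t["item"], "qty": t["qty"] + s["qty"]},
    on_not_matched=lambda s: dict(s),
)
```

Here the matching "nut" row is updated and the unmatched "screw" row is
inserted, while "bolt" is left untouched.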

I believe we will encounter some problems, especially with transaction
handling. I am also not sure whether the UPDATE, INSERT and DELETE
operations performed for previously output tuples will affect the remaining
join processing.
I hope you can help me improve this rough idea. Or, if it is not convenient
for you, please kindly forward this letter to whoever may be concerned.

Thank you very much!

Yours Boxuan

--
Best Wishes
Yours Boxuan

