Re: SQL/MED estimated time of arrival?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Shigeru HANADA <hanada(at)metrosystems(dot)co(dot)jp>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, Eric Davies <eric(at)barrodale(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: SQL/MED estimated time of arrival?
Date: 2010-11-22 02:16:05
Message-ID: AANLkTi=pxZ0NHPhDDMJGnann7DehoF1X_UZ8Lh3qqoMW@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 19, 2010 at 9:55 AM, Shigeru HANADA
<hanada(at)metrosystems(dot)co(dot)jp> wrote:
> [ new SQL/MED patch ]

I can't help noticing that this patch adds 8,982 lines and removes
408, making it far larger any other patch I've ever seen on this list.
And what that means is that committing all of this in one go is going
to be very, very difficult. Now, on the plus side, as 9000+ line
patches go, this one looks pretty well-written, at least after ten
seconds of looking at it, which is great as far as it goes, but the
sheer size is still going to make it just about impossible for anyone
to review it effectively and have real confidence that the whole thing
is commit-quality. To have a chance of getting a significant portion
of this into PostgreSQL 9.1, it really needs to be broken up into
INDEPENDENTLY COMMITTABLE SUB-PATCHES.

The key words here are "independently committable". Breaking up a
patch into sub-patches by directory, for example, is completely
useless - we're not, for example, going to commit the code first and
the docs separately. Let me say that again - the ONLY useful way of
breaking up a patch is to divide it into pieces such that EACH piece,
by itself, would represent a credible commit. Each piece should be
posted to a separate thread and a separate discussion should be had
about the merits and demerits of each one. Each should have a
separate CommitFest entry and, ideally, a separate reviewer. Of
course, it may not be possible to fully evaluate a given patch without
looking at the other ones, but the extent to which this is necessary
should be minimized; otherwise you haven't really broken it up
usefully.

Ultimately, we probably want and need to get this patch down to chunks
of less than 2000 lines each. But for starters, it looks quite simple
to break this into three chunks: one for the PostgreSQL FDW, one for
the CSV FDW, and one for the core functionality. I think that the CSV
FDW, like the PG FDW, should be a loadable module. (I wonder if it
would be more sensible to name all the FDWs as "fdw_foo" rather than
"foo_fdw", so that they alphabetize together, but I believe that Tom
has shot down similar suggestions in the past, so maybe it's not a
good idea after all.) So let's do that and then see if we can find
anything that we can either simplify (so it takes fewer lines of code)
or pull out and commit separately (because, for example, it's some
kind of refactoring that is a good idea independently of this patch).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Kupershmidt 2010-11-22 02:44:09 Re: psql: Add \dL to show languages
Previous Message Robert Haas 2010-11-22 01:46:29 Re: Instrument checkpoint sync calls