Re: Review Report: propose to include 3 new functions into intarray and intagg

From: "Dmitry Koterov" <dmitry(at)koterov(dot)ru>
To: "Markus Wanner" <markus(at)bluegap(dot)ch>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Review Report: propose to include 3 new functions into intarray and intagg
Date: 2008-09-07 12:24:40
Message-ID: d7df81620809070524u7e2597f1w8970440abffff4f6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

OK, thank you for your review.

I'll correct everything and send a patch in a couple of days. Are you
completely sure that this patch will be included? If not, seems the work of
the patch standartization has much lower priority, and I will not hurry so
much.

But, what about intarray patch? Does somebody plan to review it? I'd prefer
to include it too. If you approve, I'll correct the code style in intarray
contrib patch too.

On Sat, Sep 6, 2008 at 3:34 PM, Markus Wanner <markus(at)bluegap(dot)ch> wrote:

> Hi,
>
> this is my first "official" review. I've tried to follow the "Review a
> patch" guidelines from the wiki - thanks Simon, that was pretty helpful.
>
> This review covers only the intagg additions.
>
> Dmitry Koterov wrote:
>
>> Here are these functions with detailed documentation:
>> http://en.dklab.ru/lib/dklab_postgresql_patch/
>>
>
> Submission review: we generally prefer having patches archived on our
> mailing lists, so please just send future patches or revisions of this
> patch to our lists (I prefer -hackers, but probably -patches is still
> the official one). The documentation should go into a README file of the
> contrib module or some such. Tests are missing, but that's the case for
> the intagg contrib module anyway. The patch applies and is in context
> diff format, good. It adds an unrelated newline, which should generally be
> avoided.
>
> Usability review: functional additions look good and usable, certainly
> within a contrib module. I don't think it's compliant to any spec, but
> that's not required for contrib, IMO.
>
> Functional test: works as advertised on the accompanying website. I've
> tested the int_array_append_aggregate somewhat, see [1].
>
> Performance testing: I've tested with 10 mio rows of arrays of size 1
> and compared against core's int_array_aggregate function, see [1] again.
> In this simple test it took roughly 50% longer, which seems okay. Memory
> consumption looks sane as well.
>
> Coding review: style seems fine for contrib, though lines longer than 80
> cols should be broken up. Comments in the code are sparse , some even
> counter-productive (marking something as "additional things" certainly
> doesn't help).
>
> Code and architecture review: the new int_array_append_aggregate()
> functions itself seems fine to me.
>
> Summary: My general feeling is, that this patch should be applied after
> minor code style corrections. As a longer term goal I think intagg should be
> integrated into core, since it's very basic functionality. TODO entries for
> things like an array_accum() aggregate already exist. Adding this patch to
> contrib now might be a step into the right direction.
>
> Dmitry, can you please apply these small corrections and re-submit the
> patch?
>
> Regards
>
> Markus Wanner
>
> P.S.: I dislike the intagg's use of PGARRAY, but that's nothing to do with
> this patch. Shouldn't this better use a real composite type as the
> aggregate's state type? I'd propose to clean up the intagg contrib module
> and prepare it for inclusion into core.
>
>
> [1]: functional and performance testing session:
>
> On a database with (patched) intagg and intarr contrib modules:
>
> markus=# CREATE TABLE test (id INT NOT NULL, arr INT[] NOT NULL);
> CREATE TABLE
> markus=# INSERT INTO test VALUES (1, ARRAY[1,2,3]), (2, ARRAY[4,5]), (3,
> ARRAY[3,2,1]);
> INSERT 0 3
> markus=# SELECT * FROM test;
> id | arr
> ----+---------
> 1 | {1,2,3}
> 2 | {4,5}
> 3 | {3,2,1}
> (3 rows)
>
> markus=# SELECT int_array_append_aggregate(arr) FROM test;
> int_array_append_aggregate
> ----------------------------
> {1,2,3,4,5,3,2,1}
> (1 row)
>
> markus=# SELECT * FROM test;
> id | arr
> ----+---------
> 1 | {1,2,3}
> 2 | {4,5}
> 3 | {3,2,1}
> (3 rows)
>
> markus=# SELECT int_array_aggregate(id) AS ids,
> int_array_append_aggregate(arr) FROM test GROUP BY (id / 2);
> ids | int_array_append_aggregate
> -------+----------------------------
> {1} | {1,2,3}
> {2,3} | {4,5,3,2,1}
> (2 rows)
>
> markus=# SELECT int_array_aggregate(id) AS ids,
> int_array_append_aggregate(arr) FROM test GROUP BY (id % 2);
> ids | int_array_append_aggregate
> -------+----------------------------
> {2} | {4,5}
> {1,3} | {1,2,3,3,2,1}
> (2 rows)
>
> markus=# INSERT INTO test VALUES (4, NULL);
> INSERT 0 1
>
> markus=# SELECT int_array_append_aggregate(arr) FROM test;
> int_array_append_aggregate
> ----------------------------
> {1,2,3,4,5,3,2,1}
> (1 row)
>
> markus=# SELECT id, int_array_append_aggregate(arr) FROM test GROUP BY id;
> id | int_array_append_aggregate
> ----+----------------------------
> 4 | {}
> 2 | {4,5}
> 3 | {3,2,1}
> 1 | {1,2,3}
> (4 rows)
>
> -- switching to performance testing
>
> markus=# \timing
> Timing is on.
>
> markus=# DELETE FROM test;
> DELETE 4
> Time: 9.037 ms
>
> markus=# INSERT INTO test SELECT generate_series(1, 10000000),
> array[round(random() * 100)]::int[];
> INSERT 0 10000000
> Time: 53321.186 ms
>
> markus=# SELECT icount(int_array_aggregate(id)) AS count FROM test;
> count
> ----------
> 10000000
> (1 row)
>
> Time: 2493.184 ms
>
> markus=# SELECT icount(int_array_append_aggregate(arr)) AS count FROM test;
> count
> ----------
> 10000000
> (1 row)
>
> Time: 4152.478 ms
>
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message D'Arcy J.M. Cain 2008-09-07 12:44:51 Re: Noisy CVS updates
Previous Message Simon Riggs 2008-09-07 12:12:53 Re: Synchronous Log Shipping Replication