From: | "David E(dot) Wheeler" <david(at)kineticode(dot)com> |
---|---|
To: | Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: PATCH: CITEXT 2.0 |
Date: | 2008-07-03 04:38:32 |
Message-ID: | 55AA0795-AA11-4D44-B7F6-90B3A933365F@kineticode.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Jul 2, 2008, at 09:13, Zdenek Kotala wrote:
> I went through your code and I have following comments/questions:
Thanks. I'm on a short family vacation, so it will probably be Monday
before I can submit a new patch. I got a few replies below, though.
> 1) formating
>
> Please, do not type space before asterix:
> char * lcstr, * rcstr -> char *lcstr, *rcstr
>
> and do not put extra space in a brackets
> citextcmp( left, right ) -> citextcmp(left, right)
Okay.
> Other seems to me OK (remove 8.2 version mention at the bottom)
Um, are you referring to this (at the top):
+// PostgreSQL 8.2 Magic.
+#ifdef PG_MODULE_MAGIC
+PG_MODULE_MAGIC;
+#endif
That's gotta be there (though I can of course remove the comment).
> 2) citextcmp function returns int but citext_cmp returns int32. I
> think citextcmp should returns int32 as well.
Yeah, I'm a bit confused by this. I followed what's in varlena.c on
this. The comparison functions are documented supposed to return 1, 0,
or -1, which of course would be covered by int. But varstr_cmp(),
which ultimately returns the value, returns all kinds of numbers. So,
perhaps someone could say what it's *supposed* to be?
> 3) citext_larger, citext_smaller function have memory leak. You
> don't call PG_FREE_IF_COPY on unused text.
>
> I'm not sure if return value in these function should be duplicated
> or not.
These I also duplicated from varlena.c, and I asked about a potential
memory leak in a previous email. If there's a leak here, would there
not also be a leak in varlena.c?
> 4) Operator = citext_eq is not correct. See comment http://doxygen.postgresql.org/varlena_8c.html#8621d064d14f259c594e4df3c1a64cac
So should citextcmp() call strncmp() instead of varst_cmp()? The
latter is what I saw in varlena.c.
> There must be difference between equality and collation for example
> in Czech language 'láska' and 'laská' are different word it means
> that 'láska' != 'laská'. But there is no difference in collation
> order. See Unicode Universal Collation Algorithm for detail.
I'll leave the collation stuff to the functions I call (*far* from my
specialty), but I'll add a test for this and make sure it works as
expected. Um, although, with what collation should it be tested? The
tests I wrote assume en_US.UTF-8.
> 5) There are several commented out lines in CREATE OPERATOR
> statement mostly related to NEGATOR. Is there some reason for that?
I copied it from the original citext.sql. Not sure what effect it has.
> Also OPERATOR || has probably wrong negator.
Right, good catch.
> 6) You use pgTAP for unit test. It is something what should be
> widely discussed. It is new approach and I'm not correct person to
> say if it good or not.
Sure. I created a pgFoundry project for it here:
http://pgtap.projects.postgresql.org/
> I see there two problems:
> At first point there is not clear licensing (https://svn.kineticode.com/pgtap/trunk/README.pgtap
> ).
That's a standard revised BSD license.
> And, It should be implemented as a general framework by my opinion
> not only as a part of citext contrib module.
It is. I just put the file in citext so that the tests can use it.
It's distributed on pgFoundry and maintained at the SVN URL quoted in
the file.
> Please, let me know when you will have updated code and I will start
> deep testing.
Will do.
Best,
David
From | Date | Subject | |
---|---|---|---|
Next Message | David E. Wheeler | 2008-07-03 04:39:13 | Re: PATCH: CITEXT 2.0 |
Previous Message | Ken Camann | 2008-07-03 04:13:55 | Re: A Windows x64 port of PostgreSQL |