Re: PATCH: CITEXT 2.0 v3

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: CITEXT 2.0 v3
Date: 2008-07-12 21:57:27
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

There was some discussion earlier about whether the proposed regression
tests for citext are suitable for use in contrib or not. After playing
with them for awhile, I have to come down very firmly on the side of
"not". I have these gripes:

1. The style is gratuitously different from every other regression test
in the system. This is not a good thing. If it were an amazingly
better way to do things, then maybe, but as far as I can tell the style
pgTAP forces on you is really pretty darn poorly suited for SQL tests.
You have to contort what could naturally be expressed in SQL as a table
result into a scalar. Plus it's redundant with the expected-output file.

2. It's ridiculously slow; at least a factor of ten slower than doing
equivalent tests directly in SQL. This is a very bad thing. Speed of
regression tests matters a lot to those of us who run them a dozen times
per day --- and I do not wish to discourage any developers who don't
work that way from learning better habits ;-)

Because of #1 and #2 I find the use of pgTAP to be a nonstarter.
I have a couple of gripes about the substance of the tests as well:

3. What you are mostly testing is not the behavior of citext itself,
but the behavior of the underlying strcoll function. This is pretty
much doomed to failure, first because the regression tests are intended
to work in multiple locales (and we are *not* giving that up for
citext), and second because the behavior of strcoll isn't all that
portable across platforms even given allegedly similar locale settings
(as we already found out yesterday). Sadly, I think you have to give up
attempts to check the interesting multibyte cases and confine yourself
to tests using ASCII strings.

4. A lot of the later test cases are equally uselessly testing whether
piggybacking over text functions works. The odds of ever finding
anything with those tests are not distinguishable from zero (unless the
underlying text function is busted, which is not your responsibility to
test). So I don't see any point in putting them into the standard
regression package. (What maybe *would* be useful to test, but you
didn't, is whether the result of a function is considered citext rather
than text.)

I suggest something more like the attached as a suitable regression
test. I got bored about halfway through and started to skim, so there
might be a few tests toward the end of the original set that would be
worth transposing into this one, but nothing jumped out at me.

regards, tom lane

Attachment Content-Type Size
unknown_filename text/plain 3.0 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message David E. Wheeler 2008-07-12 22:13:13 Re: PATCH: CITEXT 2.0 v3
Previous Message David E. Wheeler 2008-07-12 21:50:24 Re: PATCH: CITEXT 2.0 v3