> On 10/15/2010 04:33 AM, Dean Rasheed wrote:
>> I started looking at this last night, but ran out of time. I'll
>> continue this evening / over the weekend.
Continuing my review of this patch...
What the patch does:
This patch adds syntax to allow additional enum values to be added to
an enum type, at any position in the list. The syntax has already been
discussed and a broad consensus reached. To me the new syntax seems
very logical and easy to use/remember.
Do we want that?
Yes, I think so. I can think of a few past projects where I could have
used this. A possible future extension would be the ability to remove
enum values, but that is likely to have additional complications, and
I think the number of use cases for it is smaller. I see no reason to
insist that it be part of this patch.
Do we already have it?
Does it follow SQL spec, or the community-agreed behavior?
Not in the SQL spec, but it does follow agreed behaviour.
Does it include pg_dump support (if applicable)?
Are there dangers?
None that I can think of.
Have all the bases been covered?
I just noticed that a couple of columns have been added to pg_type, so
there's another bit documentation that needs updating.
Does the feature work as advertised?
Are there corner cases the author has failed to consider?
None that I could find.
Are there any assertion failures or crashes?
Yes, I'm afraid I managed to provoke a crash by running the regression
tests with -DCATCACHE_FORCE_RELEASE, after spotting a suspicious bit
of code (see below).
Does the patch slow down simple tests?
There is a documented performance penalty when working with enum
values that are not in OID order. To attempt to measure this I created
2 tables, foo and bar, each with 800,000 rows. foo had an enum column
using the planets enum from the regression test, with values not in
OID order. bar had a similar enum column, but with values in the
default OID order.
Performance differences for comparison-heavy operations are most noticable:
SELECT MAX(p) FROM foo;
Time: 132.305 ms
SELECT MAX(p) FROM bar;
Time: 93.313 ms
SELECT p FROM foo ORDER BY p OFFSET 500000 LIMIT 1;
Time: 1516.725 ms
SELECT p FROM bar ORDER BY p OFFSET 500000 LIMIT 1;
Time: 1043.010 ms
(optimised build, asserts off)
That's a bigger performance hit than a I would have expected, even
though it is documented.
enum_ccmp() is using bsearch(), so there's a fair amount of function
call overhead there, and perhaps for small enums, a simple linear scan
would be faster. I'm not sure to what extent this is worth worrying
I'm not familar enough with the pg_upgrade code to really comment on it.
Looking at AddEnumLabel() I found it a bit hard to follow, but near
the end of the block used when BEFORE/AFTER is specified, it does
/* are the labels sorted by OID? */
if (result && newelemorder > 1)
result = newOid > HeapTupleGetOid(existing[newelemorder-2]);
if (result && newelemorder < nelems + 1)
result = newOid < HeapTupleGetOid(existing[newelemorder-1]);
It looks to me as though 'existing[...]' is a pointer into a member of
the list that's just been freed, so it risks reading freed memory.
That seems to be confirmed by running the tests with
-DCATCACHE_FORCE_RELEASE. Doing so causes a number of the tests to
fail/crash, but I didn't dig any deeper to confirm that this was the
For the most part, this patch looks good, but I think there is still a
bit of tidying up to do.
In response to
pgsql-hackers by date
|Next:||From: Oleg Bartunov||Date: 2010-10-16 17:42:34|
|Subject: Re: knngist plans|
|Previous:||From: Peter Eisentraut||Date: 2010-10-16 17:17:48|
|Subject: Re: knngist - 0.8|