Re: WIP: extensible enums

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: extensible enums
Date: 2010-10-16 17:25:16
Message-ID: AANLkTinxjqw30v6NDvKZd6KyqaPYV90V0fHTDaKKP9di@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 10/15/2010 04:33 AM, Dean Rasheed wrote:
>>
>> I started looking at this last night, but ran out of time. I'll
>> continue this evening / over the weekend.

Continuing my review of this patch...

Usability review
----------------

What the patch does:

This patch adds syntax to allow additional enum values to be added to
an enum type, at any position in the list. The syntax has already been
discussed and a broad consensus reached. To me the new syntax seems
very logical and easy to use/remember.

Do we want that?

Yes, I think so. I can think of a few past projects where I could have
used this. A possible future extension would be the ability to remove
enum values, but that is likely to have additional complications, and
I think the number of use cases for it is smaller. I see no reason to
insist that it be part of this patch.

Do we already have it?

No.

Does it follow SQL spec, or the community-agreed behavior?

Not in the SQL spec, but it does follow agreed behaviour.

Does it include pg_dump support (if applicable)?

Yes.

Are there dangers?

None that I can think of.

Have all the bases been covered?

I just noticed that a couple of columns have been added to pg_type, so
there's another bit documentation that needs updating.

Feature test
------------

Does the feature work as advertised?

Yes.

Are there corner cases the author has failed to consider?

None that I could find.

Are there any assertion failures or crashes?

Yes, I'm afraid I managed to provoke a crash by running the regression
tests with -DCATCACHE_FORCE_RELEASE, after spotting a suspicious bit
of code (see below).

Performance review
------------------

Does the patch slow down simple tests?

There is a documented performance penalty when working with enum
values that are not in OID order. To attempt to measure this I created
2 tables, foo and bar, each with 800,000 rows. foo had an enum column
using the planets enum from the regression test, with values not in
OID order. bar had a similar enum column, but with values in the
default OID order.

Performance differences for comparison-heavy operations are most noticable:

SELECT MAX(p) FROM foo;
max
---------
neptune
(1 row)

Time: 132.305 ms

SELECT MAX(p) FROM bar;
max
---------
neptune
(1 row)

Time: 93.313 ms

SELECT p FROM foo ORDER BY p OFFSET 500000 LIMIT 1;
p
--------
saturn
(1 row)

Time: 1516.725 ms

SELECT p FROM bar ORDER BY p OFFSET 500000 LIMIT 1;
p
--------
saturn
(1 row)

Time: 1043.010 ms

(optimised build, asserts off)

That's a bigger performance hit than a I would have expected, even
though it is documented.

enum_ccmp() is using bsearch(), so there's a fair amount of function
call overhead there, and perhaps for small enums, a simple linear scan
would be faster. I'm not sure to what extent this is worth worrying
about.

Code review
-----------

I'm not familar enough with the pg_upgrade code to really comment on it.

Looking at AddEnumLabel() I found it a bit hard to follow, but near
the end of the block used when BEFORE/AFTER is specified, it does
this:

ReleaseCatCacheList(list);

/* are the labels sorted by OID? */
if (result && newelemorder > 1)
result = newOid > HeapTupleGetOid(existing[newelemorder-2]);
if (result && newelemorder < nelems + 1)
result = newOid < HeapTupleGetOid(existing[newelemorder-1]);

It looks to me as though 'existing[...]' is a pointer into a member of
the list that's just been freed, so it risks reading freed memory.
That seems to be confirmed by running the tests with
-DCATCACHE_FORCE_RELEASE. Doing so causes a number of the tests to
fail/crash, but I didn't dig any deeper to confirm that this was the
cause.

For the most part, this patch looks good, but I think there is still a
bit of tidying up to do.

Regards,
Dean

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Oleg Bartunov 2010-10-16 17:42:34 Re: knngist plans
Previous Message Peter Eisentraut 2010-10-16 17:17:48 Re: knngist - 0.8