Re: Review: Patch: Allow substring/replace() to get/set bit values

From: Leonardo F <m_lists(at)yahoo(dot)it>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Review: Patch: Allow substring/replace() to get/set bit values
Date: 2010-01-19 12:08:55
Message-ID: 838145.54566.qm@web29012.mail.ird.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> In the
> documentation, the get_bit and set_bit methods are added to a list
> where we state "The following SQL-standard functions work on bit
> strings as well as character strings"; however they are not
> SQL-standard functions and are implemented on binary strings, not
> character strings.

Ok, I didn't change that, but I guess I can fix it.

> The tests don't include any examples where the bit
> string lines up with a byte boundary or is operating on the first or
> last bit of a byte, and the overlay function is not tested with
> values which change the length of the bit string. (All of these
> cases seem to work in the current version, but seem like the sort of
> thing which could get broken in revision.)

Ok, I'll add some corner-cases tests.

> There is a peculiar omission from the patch -- it adds supportfor the
> overlay function to bit strings but not binary strings. Sincethe
> standard drops the bit string as a standard type in the 2003standard,
> in favor of boolean type and binary string types, it seemsodd to omit
> binary string support (which is defined in the standard)but include
> bit string support (which isn't). What the patch adds seemsuseful,
> and I don't think the omission should be considered fatal, butit
> would be nice to see us also cover binary strings if we can.

Mmh, don't know how complicated that would be, but I can give it a try:
I guess it could be a simple SQL replacement (as for char strings and bit
strings?)

> Well, there was one thing which is beyond my ken
> in this regard: I'm not sure that the cases from bits8 to (unsigned
> char *) are needed or appropriate, although on my machine they don't
> seem to hurt. Perhaps someone with a better grasp of such issues can
> comment.

That's some kind of copy&paste from varlena.c byteaGetBit: I don't know
if they can be avoided.

> I'm not sure the leading whitespace in pg_proc.h is as it should be.

Do you mean the tab in:

_null_(tab)"select ...

?

Yes, I think I should remove it.

> I'd prefer to see the comments for get_bit and set_bit mention that
> the location is specified left-to-right in a zero-based fashion
> consistent with the other get_bit and set_bit functions, but
> inconsistent with the standard substring, position, overlay
> functions. Personally, I find it unfortunate that our extensions
> introduced this inconsistency, but let's keep it consistent within
> the similarly named functions.

Ok; I found them bizarre too, but I kept it consistent.

> It seems odd to me that you specify the bit to set as a 32 bit
> integer, and that that is what get_bit returns, but again, this is
> consistent with what we do for the bytea functions, so it's probably
> the right thing to match that.

Same as above: I tried to be consistent.

> Using a ERRCODE_ARRAY_SUBSCRIPT_ERROR for a bit position which is
> out-of-bounds seems somewhat bizarre to me -- I'd probably have
> chosen ERRCODE_INVALID_PARAMETER_VALUE in a green field -- but again,
> it matches the bytea implementation.

Again, same as above: consistency; but I can change it if you want

> This last point is probably more a matter of C coding style than
> anything, and I'm way to rusty in C to want to be the final arbiter
> of something like this, but it seems to me that oldByte and newByte
> local variables are just cluttering things up rather than clarifying:
>
> int oldByte,
> newByte;
> [...]
> oldByte = ((unsigned char *) r)[byteNo];
>
> if (newBit == 0)
> newByte = oldByte & (~(1 << bitNo));
> else
> newByte = oldByte | (1 << bitNo);
>
> ((unsigned char *) r)[byteNo] = newByte;
>
> vs
>
> if (newBit == 0)
> ((unsigned char *) r)[byteNo] &= (~(1 << bitNo));
> else
> ((unsigned char *) r)[byteNo] |= (1 << bitNo);
>

I agree, you version is cleaner.

Leonardo

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2010-01-19 14:12:21 Re: Clearing global statistics
Previous Message Boszormenyi Zoltan 2010-01-19 11:01:24 Re: Fix auto-prepare #2