Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp> writes:
> make_greater_string() does not return a string when certain UTF8 strings
> are set to str_const.
> # Especially UTF8 strings that contain 'BF' in the last byte.

The patch you propose for this is really untenable: it will re-introduce
many corner cases that we got rid of years ago, for example cases
wherein pg_verifymbstr and pg_mbcliplen index off the end of the string
because they think the last character occupies more bytes than are
there. It's intentional that the existing code doesn't mess with the
first byte of a multibyte character (which is the one that determines
the character length, in all encodings of interest).
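
To illustrate the point about the first byte, here is a minimal sketch in C. It is not PostgreSQL's code: the helper names utf8_len_from_lead and last_char_fits are hypothetical, and only UTF-8 is modelled. It shows that bumping the lead byte of a two-byte character can make it claim three bytes, so any routine that trusts the claimed length would index past the end of the buffer.

```c
#include <stddef.h>

/*
 * Hypothetical helper: number of bytes a UTF-8 character claims to
 * occupy, judged only from its first (lead) byte.
 */
static int
utf8_len_from_lead(unsigned char lead)
{
    if ((lead & 0x80) == 0x00)
        return 1;               /* 0xxxxxxx: ASCII */
    if ((lead & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0)
        return 4;               /* 11110xxx */
    return 1;                   /* continuation or invalid byte */
}

/*
 * Hypothetical check: does the character starting at last_start fit
 * inside a buffer of len bytes?  Returns 1 if it fits, 0 if the lead
 * byte claims more bytes than are actually present.
 */
static int
last_char_fits(const unsigned char *s, size_t len, size_t last_start)
{
    return last_start + (size_t) utf8_len_from_lead(s[last_start]) <= len;
}
```

For the two-byte sequence 0xDF 0xBF the check succeeds. After incrementing the first byte to 0xE0 the character claims three bytes while only two exist, and the check fails.
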
Another problem is that if the last character is several bytes long,
this coding would cause us to iterate through potentially many millions
of character values before giving up and truncating off the last
character. In a large number of cases that's just wasted time because
there is no chance of getting a larger string without incrementing some
character further to the left. So there's a tradeoff that limits how
many values we should consider for each character position --- choosing
to consider at most 255 is a bit arbitrary, but "all of them" isn't
going to work.
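
The tradeoff can be sketched roughly as follows. This is a simplified illustration, not the actual make_greater_string(): the function names are hypothetical, only UTF-8 is handled, the validity check is structural only (it ignores overlong forms and surrogates), and the collation comparison the real code must also satisfy is left out. Only the last byte of the string is stepped, at most 255 times. If no valid string turns up, the whole last character is truncated and the search moves one character to the left.

```c
#include <stddef.h>

/* Hypothetical: length claimed by a UTF-8 lead byte, 0 if not a lead byte. */
static int
lead_len(unsigned char b)
{
    if (b < 0x80)
        return 1;
    if ((b & 0xE0) == 0xC0)
        return 2;
    if ((b & 0xF0) == 0xE0)
        return 3;
    if ((b & 0xF8) == 0xF0)
        return 4;
    return 0;
}

/* Simplified structural UTF-8 check (assumption: no overlong/surrogate tests). */
static int
utf8_valid_sketch(const unsigned char *s, size_t len)
{
    size_t      i = 0;

    while (i < len)
    {
        int         n = lead_len(s[i]);

        if (n == 0 || i + (size_t) n > len)
            return 0;           /* bad lead byte, or character runs off the end */
        for (int k = 1; k < n; k++)
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;       /* expected a continuation byte */
        i += (size_t) n;
    }
    return 1;
}

/*
 * Modify buf in place into a bytewise-greater valid string.
 * Returns the new length, or 0 if no such string was found.
 */
static size_t
greater_string_sketch(unsigned char *buf, size_t len)
{
    while (len > 0)
    {
        unsigned char saved = buf[len - 1];

        /* Bounded search: at most 255 candidate values for the last byte. */
        while (buf[len - 1] < 0xFF)
        {
            buf[len - 1]++;
            if (utf8_valid_sketch(buf, len))
                return len;
        }
        buf[len - 1] = saved;

        /* Give up on this character: drop its last byte, then the rest of it. */
        len--;
        while (len > 0 && (buf[len] & 0xC0) == 0x80)
            len--;
    }
    return 0;
}
```

With the input bytes 0x61 0xC2 0xBF, every value above 0xBF in the last position is rejected, so the two-byte character is dropped and the result is the single byte 0x62. This matches the reported behaviour for strings ending in 'BF': the search at that position fails and the routine falls back to truncation.
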
I don't think that the set of cases that could be improved this way is
large enough to justify trying to find solutions to these problems.

regards, tom lane