| From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | adam(at)labkey(dot)com, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org | 
| Subject: | Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails | 
| Date: | 2024-11-20 16:50:57 | 
| Message-ID: | Zz4TcYSNWW1_Vj--@nathan | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-bugs | 
On Wed, Nov 20, 2024 at 11:29:56AM -0500, Tom Lane wrote:
> Nathan Bossart <nathandbossart(at)gmail(dot)com> writes:
>> Upthread, you mentioned that we could bypass multiple lookups unless both
>> the NAMEDATALEN-1'th and NAMEDATALEN-2'th bytes are non-ASCII.  But if
>> there are encodings with the high bit set that don't require multiple bytes
>> per character, then how can we do that?
> 
> Well, we don't know the length of the hypothetically-truncated
> character, but if there was one then all its bytes must have had their
> high bits set.  Suppose that the untruncated name has a 4-byte
> multibyte character extending from the NAMEDATALEN-3 byte through the
> NAMEDATALEN'th byte (counting in origin zero here):
>
> [...]
> 
> Now as for the shortcut cases: if C3 does not have the high bit set,
> it cannot be part of a multibyte character.  Therefore the original
> encoding-aware truncation would have removed C3 and following bytes,
> but no more.  The character immediately before might have been one
> byte or several, but it doesn't matter.  Similarly, if C2 does not
> have the high bit set, it cannot be part of a multibyte character.
> The original truncation would have removed C3 and following bytes,
> but no more.
Oh, I think I had an off-by-one error in my mental model and was thinking
of the NAMEDATALEN-1'th byte as the last possible byte in the identifier
(i.e., name[NAMEDATALEN - 2]), whereas you meant the location where the
trailing zero would go for the largest possible all-ASCII identifier (i.e.,
name[NAMEDATALEN - 1]).  Thank you for elaborating.
-- 
nathan
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Mark Hill | 2024-11-20 21:42:00 | Just tried to build Postgres 17 on AIX | 
| Previous Message | Tom Lane | 2024-11-20 16:29:56 | Re: BUG #18711: Attempting a connection with a database name longer than 63 characters now fails |