Re: NAMEDATALEN increase because of non-latin languages

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, Денис Романенко <deromanenko(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: NAMEDATALEN increase because of non-latin languages
Date: 2021-08-18 14:21:03
Message-ID: 1428154.1629296463@sss.pgh.pa.us
Lists: pgsql-hackers

John Naylor <john(dot)naylor(at)enterprisedb(dot)com> writes:
> The main thing I'm worried about is the fact that a name would no longer
> fit in a Datum. The rest I think we can mitigate in some way.

Not sure what you mean by that? name is a pass-by-ref data type.

Anyway, this whole argument could be rendered moot if we could convert
name to a variable-length type. That would satisfy *both* sides of
the argument, since those who need long names could have them, while
those who don't would see net reduction instead of growth in space usage.
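
To make the space argument concrete, here is a minimal sketch comparing a fixed-width name with a NUL-terminated one. DEMO_NAMEDATALEN, DemoFixedName and the demo_* helpers are hypothetical names used only for illustration; the value 64 mirrors the default NAMEDATALEN. Since a Cyrillic letter occupies two bytes in UTF-8, the 63 usable bytes hold only about 31 such characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical constant mirroring PostgreSQL's default NAMEDATALEN. */
#define DEMO_NAMEDATALEN 64

/* Fixed-width name: every value occupies the full array, however short. */
typedef struct DemoFixedName
{
    char data[DEMO_NAMEDATALEN];
} DemoFixedName;

/* Bytes used by the fixed-width form; independent of the name itself. */
static size_t
demo_fixed_size(void)
{
    return sizeof(DemoFixedName);
}

/* Bytes a NUL-terminated (cstring-like) form would use: length plus NUL. */
static size_t
demo_varlen_size(const char *name)
{
    assert(name != NULL);
    return strlen(name) + 1;
}

/* A name fits the fixed-width form only if there is room for the NUL. */
static int
demo_fits_fixed(const char *name)
{
    assert(name != NULL);
    return strlen(name) < DEMO_NAMEDATALEN;
}
```

Under these assumptions a short name such as "orders" needs 7 bytes instead of 64, while a name of 64 bytes or more is rejected by the fixed-width form but simply costs its own length in the variable-length one.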

Of course, this is far, far easier said than done; else we would have
done it years ago. But maybe it's not entirely out of reach.
I do not think it'd be hard to change "name" to have the same on-disk
storage representation as cstring; the hard part is how to handle its
usage in fixed-width catalog structures. Maybe we could finesse that
by decreeing that the name column always has to be the last
non-CATALOG_VARLEN field. (This would require fixing up the couple of
places where we let some other var-width field have that distinction;
but I suspect that would be small in comparison to the other work this
implies. If there are any catalogs having two name columns, one of them
would become more difficult to reach from C code.)
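
The "name goes last" idea can be sketched in plain C with a flexible array member. This is only an illustration under stated assumptions: DemoCatalogRow and the demo_row_* functions are hypothetical and are not PostgreSQL's catalog structs or APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical catalog row.  The fixed-width fields come first, so C code
 * can reach them at constant offsets.  The name is stored last as a
 * NUL-terminated string of whatever length it needs.
 */
typedef struct DemoCatalogRow
{
    uint32_t oid;
    uint32_t owner;
    char     name[];    /* flexible array member; must be the last field */
} DemoCatalogRow;

/* Allocate a row sized exactly for its name; returns NULL on failure. */
static DemoCatalogRow *
demo_row_create(uint32_t oid, uint32_t owner, const char *name)
{
    size_t          len;
    DemoCatalogRow *row;

    assert(name != NULL);
    len = strlen(name) + 1;     /* include the terminating NUL */
    row = malloc(offsetof(DemoCatalogRow, name) + len);
    if (row == NULL)
        return NULL;
    row->oid = oid;
    row->owner = owner;
    memcpy(row->name, name, len);
    return row;
}

/* Total bytes occupied by the row, including the variable-length name. */
static size_t
demo_row_size(const DemoCatalogRow *row)
{
    return offsetof(DemoCatalogRow, name) + strlen(row->name) + 1;
}
```

The sketch also shows why a second name column is awkward: anything placed after name has no fixed offset, so C code would have to skip over strlen(name) + 1 bytes to find it rather than using a plain struct member.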

Another fun thing --- and, I fear, another good argument against just
raising NAMEDATALEN --- is TupleDescs, which last I checked
used an array of fixed-width pg_attribute images. But maybe we could
replace that with an array of pointers. Andres already did a lot of
the heavy code churn required to hide that data structure behind
TupleDescAttr() macros, so changing the representation should be much
less painful than it would once have been.
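
A rough sketch of the array-of-pointers representation, hidden behind an accessor macro, might look like the following. All names here (DemoAttr, DemoTupleDesc, DemoTupleDescAttr, demo_desc_*) are hypothetical and only loosely modeled on the idea of TupleDescAttr(); this is not the actual PostgreSQL definition.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical attribute entry with a name of arbitrary length. */
typedef struct DemoAttr
{
    int   attnum;
    char *attname;      /* separately allocated, no fixed width */
} DemoAttr;

/* Hypothetical descriptor: an array of pointers, not of fixed-width images. */
typedef struct DemoTupleDesc
{
    int        natts;
    DemoAttr **attrs;
} DemoTupleDesc;

/* Callers go through the macro, so the representation can change freely. */
#define DemoTupleDescAttr(desc, i) ((desc)->attrs[(i)])

/* Portable string copy (strdup is POSIX, not ISO C before C23). */
static char *
demo_copy_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char  *copy = malloc(len);

    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Free a descriptor; tolerates partially built ones. */
static void
demo_desc_free(DemoTupleDesc *desc)
{
    if (desc == NULL)
        return;
    for (int i = 0; i < desc->natts; i++)
    {
        if (desc->attrs[i] != NULL)
        {
            free(desc->attrs[i]->attname);
            free(desc->attrs[i]);
        }
    }
    free(desc->attrs);
    free(desc);
}

/* Build a descriptor from natts names; returns NULL on allocation failure. */
static DemoTupleDesc *
demo_desc_create(int natts, const char *const *names)
{
    DemoTupleDesc *desc;

    assert(natts > 0 && names != NULL);
    desc = malloc(sizeof(*desc));
    if (desc == NULL)
        return NULL;
    desc->natts = natts;
    desc->attrs = calloc((size_t) natts, sizeof(DemoAttr *));
    if (desc->attrs == NULL)
    {
        free(desc);
        return NULL;
    }
    for (int i = 0; i < natts; i++)
    {
        DemoAttr *att = malloc(sizeof(*att));

        if (att == NULL)
        {
            demo_desc_free(desc);
            return NULL;
        }
        att->attnum = i + 1;
        att->attname = demo_copy_string(names[i]);
        desc->attrs[i] = att;   /* stored first so cleanup can find it */
        if (att->attname == NULL)
        {
            demo_desc_free(desc);
            return NULL;
        }
    }
    return desc;
}
```

Because each entry is reached through a pointer, entries no longer need a common fixed size, and code that already uses the accessor macro would not need to change when the underlying layout does.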

I wonder if we'd get complaints from changing the catalog column layouts
that much. People are used to the name at the front, I think. OTOH,
I expected a lot of bleating about the OID column becoming frontmost,
but there hasn't been much.

Anyway, I have little desire to work on this myself, but I recommend that
somebody who is more affected by the name length restriction look into it.

regards, tom lane
