Re: NAMEDATALEN increase because of non-latin languages

From: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: Денис Романенко <deromanenko(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: NAMEDATALEN increase because of non-latin languages
Date: 2021-08-18 13:06:12
Message-ID: CAEudQAqOqX0kZv=goGoXhC+LsutcBAXfsLyd_hDK=K_74a0A6A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Aug 18, 2021 at 09:33, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote:

> On Wed, 2021-08-18 at 08:16 -0300, Ranier Vilela wrote:
> > On Wed, Aug 18, 2021 at 08:08, Денис Романенко <deromanenko(at)gmail(dot)com> wrote:
> > > Hello, dear hackers. I understand the developer community's position on
> > > NAMEDATALEN. In fact, 63 bytes is more than enough, but only if we
> > > are talking about Latin-script languages.
> > >
> > > PostgreSQL has wonderful support for Unicode in table and column names, and
> > > it looks like a very good idea to create tables with names in the native
> > > language for databases across the world. But when I want to create, for
> > > example, a table named "Catalog_Контрагенты_КонтактнаяИнформация" (Russian
> > > for "catalog of counterparty contacts"), the name is automatically truncated
> > > to "Catalog_Контрагенты_КонтактнаяИнформ". This is not a fictional problem:
> > > many words in Russian are simply longer than their English counterparts, and
> > > I have many examples like this.
> > >
> > > Although recompiling the source is not so hard, updating is hard. I know
> > > that a larger limit is not free in terms of disk space, because table names
> > > and field names are stored, but from my point of view, in 2021 convenience
> > > is more important than disk space.
> > >
> > > I ask you to consider increasing NAMEDATALEN, perhaps to 128 bytes, in
> > > future releases.
>
> My stance here is that you should always use ASCII only for database
> identifiers,
> not only because of this, but also to avoid unpleasant encoding problems if
> you want to do something like
>
> pg_dump -t Catalog_Контрагенты_КонтактнаяИнформация mydb
>
> on a shell with an encoding different from the database encoding.
>
> So I am not too excited about this.
>
> > +1, since Oracle Database 12.2 and higher supports 128-byte names.
> > This could possibly affect some migrations from Oracle to Postgres in the
> > future.
>
> That seems to be a better argument from my point of view.
>
> I have no idea as to how bad the additional memory impact for the catalog
> caches would be...
>
It seems to me that this is a case for a macro:
HAS_SUPPORT_NAME_128_BYTES
Денис Романенко would like this and would accept the price of a regression in
exchange for the convenience.
What affects him now is the difficulty of maintaining a private tree with
this support.
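
For reference, the truncation reported in the original message can be reproduced with a short Python sketch. It assumes a UTF-8 database, where each Cyrillic letter takes 2 bytes, and the default NAMEDATALEN of 64, which leaves 63 usable bytes. The function name truncate_identifier is hypothetical. It only models the byte-limit arithmetic and is not the server's actual code.

```python
# Rough model of identifier truncation under a byte limit.
# Assumptions: UTF-8 encoding; NAMEDATALEN = 64 is the default build value,
# leaving NAMEDATALEN - 1 bytes for the name itself.
NAMEDATALEN = 64


def truncate_identifier(name: str, namedatalen: int = NAMEDATALEN) -> str:
    """Clip `name` to at most namedatalen - 1 bytes of UTF-8.

    A multibyte character that would be cut in half is dropped whole,
    so the result is always valid UTF-8.
    """
    max_bytes = namedatalen - 1
    raw = name.encode("utf-8")
    if len(raw) <= max_bytes:
        return name
    # errors="ignore" discards an incomplete trailing byte sequence, if any.
    return raw[:max_bytes].decode("utf-8", errors="ignore")


name = "Catalog_Контрагенты_КонтактнаяИнформация"

print(len(name))                       # characters in the name
print(len(name.encode("utf-8")))       # bytes in UTF-8
print(truncate_identifier(name))       # clipped at the default limit
print(truncate_identifier(name, 128))  # unchanged with a 128-byte build
```

The name is only 40 characters long but takes 71 bytes in UTF-8, so it exceeds the 63-byte limit even though an ASCII name of the same character length would fit easily.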

regards,
Ranier Vilela
