Re: NAMEDATALEN increase because of non-latin languages

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, John Naylor <john(dot)naylor(at)enterprisedb(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Денис Романенко <deromanenko(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: NAMEDATALEN increase because of non-latin languages
Date: 2022-06-23 21:49:09
Message-ID: 20220623214909.4liatiztuojano77@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2022-06-23 14:42:17 -0400, Robert Haas wrote:
> On Thu, Jun 23, 2022 at 2:07 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > The extra cost of the deforming step could also be repaid, in some
> > cases, by not having to use SysCacheGetAttr etc later on to fetch
> > variable-length fields. That is, I'm imagining that the deformer
> > would extract all the fields, even varlena ones, and drop pointers
> > or whatever into fields of the C struct.

I was also thinking we'd translate all attributes. Not entirely sure whether
we'd want to use "plain" pointers though - there are some places where we rely
on being able to copy such structs around. That'd be a bit easier with
relative pointers, pointing to the end of the struct. But likely the
notational overhead of dealing with relative pointers would be far higher than
the notational overhead of having to invoke a generated "tuple struct" copy
function. Which we'd likely need anyway, because some previously statically
sized allocations would end up being variable sized?
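
A minimal sketch of the trade-off in the paragraph above, using made-up types rather than anything in the PostgreSQL tree. RelAttr, PtrAttr, RELATTR_NAME and the *_make/*_copy functions are all hypothetical names. Variant A locates the name through a relative offset, so a flat memcpy() is a valid copy but every access needs the accessor macro. Variant B uses a plain pointer, so access is ordinary C, but copying needs a per-struct function that re-points the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Variant A (hypothetical): the name is found via an offset from the struct start. */
typedef struct RelAttr
{
    size_t total_size;   /* header plus trailing name bytes */
    int    attnum;
    size_t attname_off;  /* relative "pointer" to the name */
} RelAttr;

/* Notational overhead: every access to the name goes through this macro. */
#define RELATTR_NAME(a) ((char *) (a) + (a)->attname_off)

static RelAttr *
relattr_make(int attnum, const char *name)
{
    size_t   namelen = strlen(name) + 1;
    RelAttr *a = malloc(sizeof(RelAttr) + namelen);

    if (a == NULL)
        return NULL;
    a->total_size = sizeof(RelAttr) + namelen;
    a->attnum = attnum;
    a->attname_off = sizeof(RelAttr);
    memcpy(RELATTR_NAME(a), name, namelen);
    return a;
}

/* A flat copy stays valid because the offset does not depend on the address. */
static RelAttr *
relattr_copy(const RelAttr *src)
{
    RelAttr *dst = malloc(src->total_size);

    if (dst != NULL)
        memcpy(dst, src, src->total_size);
    return dst;
}

/* Variant B (hypothetical): a plain pointer into the trailing data. */
typedef struct PtrAttr
{
    size_t total_size;   /* header plus trailing name bytes */
    int    attnum;
    char  *attname;      /* points just past this struct */
} PtrAttr;

static PtrAttr *
ptrattr_make(int attnum, const char *name)
{
    size_t   namelen = strlen(name) + 1;
    PtrAttr *a = malloc(sizeof(PtrAttr) + namelen);

    if (a == NULL)
        return NULL;
    a->total_size = sizeof(PtrAttr) + namelen;
    a->attnum = attnum;
    a->attname = (char *) (a + 1);
    memcpy(a->attname, name, namelen);
    return a;
}

/* Stand-in for a generated "tuple struct" copy function: copy, then fix the pointer. */
static PtrAttr *
ptrattr_copy(const PtrAttr *src)
{
    PtrAttr *dst = malloc(src->total_size);

    if (dst == NULL)
        return NULL;
    memcpy(dst, src, src->total_size);
    dst->attname = (char *) (dst + 1);  /* re-point at the copy's own data */
    return dst;
}
```

With variant B a caller writes copy->attname directly, while variant A needs RELATTR_NAME(copy) at every use. Both variants carry total_size because the allocation is no longer statically sized.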

> Yeah, if we were going to do something like this, I can't see why we
> wouldn't do it this way. It wouldn't make sense to do it for only some
> of the attributes.

Agreed.

> I'm not sure exactly where we would put this translation step, though.
> I think for the syscaches and relcache we'd want to translate when
> populating the cache so that when you do a cache lookup you get the
> data already translated. It's hard to be sure without testing, but
> that seems like it would make this cheap enough that we wouldn't have
> to be too worried, since the number of times we build new cache
> entries should be small compared to the number of times we access
> existing ones. The trickier thing might be code that uses
> systable_beginscan() et al. directly.

I was thinking we'd basically do it wherever we do a GETSTRUCT() today.

A first step could be to transform code like
    (Form_pg_attribute) GETSTRUCT(tuple)
into
    GETSTRUCT(pg_attribute, tuple)

then, in a subsequent step, we'd redefine GETSTRUCT as something like
    #define GETSTRUCT(catalog, tuple) tuple_to_struct_##catalog(tuple)
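
To make the two steps concrete, here is a small self-contained sketch. It uses made-up stand-ins, not the real PostgreSQL definitions: DemoTuple, FormData_demo_attribute, Struct_demo_attribute, demo_make_tuple and the GETSTRUCT_STEP1/GETSTRUCT_STEP2 names are all assumptions for illustration. The two macros have distinct names only so both can live in one file. The point is that the catalog name becomes a macro argument, which token pasting can later route to a per-catalog deforming function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a heap tuple: just a pointer to the stored bytes. */
typedef struct DemoTuple
{
    void *t_data;
} DemoTuple;

/* Hypothetical stored layout: fixed-width fields, then a variable-length name. */
typedef struct FormData_demo_attribute
{
    uint32_t attrelid;
    int16_t  attnum;
    uint8_t  namelen;     /* number of bytes in namedata, no terminator */
    char     namedata[];  /* flexible array member */
} FormData_demo_attribute;
typedef FormData_demo_attribute *Form_demo_attribute;

/* Step 1: still a plain cast, but the catalog name is now an argument. */
#define GETSTRUCT_STEP1(catalog, tuple) ((Form_##catalog) (tuple)->t_data)

/* Hypothetical deformed struct: every field, including the name, is directly usable. */
typedef struct Struct_demo_attribute
{
    uint32_t attrelid;
    int16_t  attnum;
    char    *attname;     /* NUL-terminated, stored just past this struct */
} Struct_demo_attribute;

/* Stand-in for a generated deformer; the caller releases the result with free(). */
static Struct_demo_attribute *
tuple_to_struct_demo_attribute(const DemoTuple *tuple)
{
    const FormData_demo_attribute *stored;
    Struct_demo_attribute *result;

    assert(tuple != NULL && tuple->t_data != NULL);
    stored = tuple->t_data;
    result = malloc(sizeof(*result) + stored->namelen + 1);
    if (result == NULL)
        return NULL;
    result->attrelid = stored->attrelid;
    result->attnum = stored->attnum;
    result->attname = (char *) (result + 1);
    memcpy(result->attname, stored->namedata, stored->namelen);
    result->attname[stored->namelen] = '\0';
    return result;
}

/* Step 2: the same call-site spelling now dispatches to the deformer. */
#define GETSTRUCT_STEP2(catalog, tuple) tuple_to_struct_##catalog(tuple)

/* Test helper (hypothetical): build a stored tuple; t_data is NULL on failure. */
static DemoTuple
demo_make_tuple(uint32_t attrelid, int16_t attnum, const char *name)
{
    DemoTuple tuple = {NULL};
    size_t    len = strlen(name);
    FormData_demo_attribute *stored;

    if (len > UINT8_MAX)
        len = UINT8_MAX;  /* truncate to what namelen can describe */
    stored = malloc(sizeof(*stored) + len);
    if (stored == NULL)
        return tuple;
    stored->attrelid = attrelid;
    stored->attnum = attnum;
    stored->namelen = (uint8_t) len;
    memcpy(stored->namedata, name, len);
    tuple.t_data = stored;
    return tuple;
}
```

A call site such as GETSTRUCT_STEP2(demo_attribute, &tup)->attname reads the same as the step-1 form. What this sketch glosses over is ownership: here the caller must free() the result, whereas today's GETSTRUCT() just returns a pointer into the tuple.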

Greetings,

Andres Freund
