Re: [PATCH] Refactor *_abbrev_convert() functions

From: Aleksander Alekseev <aleksander(at)tigerdata(dot)com>
To: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Cc: John Naylor <johncnaylorls(at)gmail(dot)com>
Subject: Re: [PATCH] Refactor *_abbrev_convert() functions
Date: 2026-02-03 14:56:16
Message-ID: CAJ7c6TPZVKk1Dn7LWRatBi=UFYfEPmfGGFYT8wQzDj7y8oJ_CQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi John,

Many thanks for your feedback!

> There's more we can do here. Above the stanzas changed in the patch
> there is this, at least for varlena/bytea:
>
> hash = DatumGetUInt32(hash_any((unsigned char *) authoritative_data,
> Min(len, PG_CACHE_LINE_SIZE)));
>
> This makes no sense to me: hash_any() calls hash_bytes() and turns the
> result into a Datum, and then we just get it right back out of the
> Datum again.

I see similar patterns in files other than bytea.c and varlena.c.
Implemented as a separate patch.

> if (len > PG_CACHE_LINE_SIZE)
> hash ^= DatumGetUInt32(hash_uint32((uint32) len));
>
> Similar here, but instead of hash_bytes_uint32(), we may as well use
> mumurhash32().

Ditto.

It's worth noting that timetz_hash() uses a similar pattern but I
choose to keep it as is. Changing it will break backward
compatibility. Also it breaks our tests.

Using hash_bytes_uint32() / hash_bytes_uint32_extended() directly in
timetz_hash() / timetz_hash_extended() is safe though. Proposed as a
separate patch.

> addHyperLogLog says "typically generated using
> hash_any()", but that function takes a uint32, not a Datum, so that
> comment should probably be changed. hash_bytes() is global, so we can
> use it directly.

Makes sense. Implemented as an independent patch.

--
Best regards,
Aleksander Alekseev

Attachment Content-Type Size
v2-0004-Improve-the-comment-for-addHyperLogLog.patch text/x-patch 1.5 KB
v2-0005-Avoid-unnecessary-type-casting-in-timetz_hash-tim.patch text/x-patch 1.6 KB
v2-0003-Use-murmurhash32-instead-of-hash_uint32-where-app.patch text/x-patch 2.2 KB
v2-0002-Avoid-unnecessary-type-casting-when-using-hash_an.patch text/x-patch 8.3 KB
v2-0001-Refactor-_abbrev_convert-functions.patch text/x-patch 4.6 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Florents Tselai 2026-02-03 15:07:18 Re: Add LIMIT option to COPY FROM
Previous Message Tom Lane 2026-02-03 14:56:03 Re: Get rid of the pre-C11 _Alignof define in zic.c