Re: [PATCH] Refactor *_abbrev_convert() functions

From: John Naylor <johncnaylorls(at)gmail(dot)com>
To: Aleksander Alekseev <aleksander(at)tigerdata(dot)com>
Cc: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Refactor *_abbrev_convert() functions
Date: 2026-02-03 01:07:59
Message-ID: CANWCAZYHK4F1MPBytEKSS8qhi9kiUXhJTZq-rWcyzk6BCOyfYg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jan 13, 2026 at 7:34 PM Aleksander Alekseev
<aleksander(at)tigerdata(dot)com> wrote:
> Now when all Datums are 64-bit values we can simplify the code by
> using murmurhash64(). This refactoring was previously suggested by
> John Naylor [1].

There's more we can do here. Above the stanzas changed in the patch
there is this, at least for varlena/bytea:

    hash = DatumGetUInt32(hash_any((unsigned char *) authoritative_data,
                                   Min(len, PG_CACHE_LINE_SIZE)));

This makes no sense to me: hash_any() calls hash_bytes() and wraps the
result in a Datum, and then we immediately unwrap it again. The comment
on addHyperLogLog() says "typically generated using hash_any()", but
addHyperLogLog() takes a uint32, not a Datum, so that comment should
probably be changed. hash_bytes() is global, so we can use it directly.
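
To illustrate the round trip, here is a minimal standalone sketch, not
PostgreSQL code. Every sketch_* name is hypothetical, SketchDatum merely
stands in for a 64-bit Datum, and sketch_hash_bytes() uses FNV-1a as a
placeholder rather than the real hash_bytes() algorithm. It only shows
that boxing a uint32 into a Datum and unboxing it again yields the same
value as calling the byte hash directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for a 64-bit Datum. */
typedef uint64_t SketchDatum;

/* Placeholder byte hash (FNV-1a); NOT the algorithm used by hash_bytes(). */
static uint32_t
sketch_hash_bytes(const unsigned char *k, size_t keylen)
{
	uint32_t	h = 2166136261u;

	for (size_t i = 0; i < keylen; i++)
	{
		h ^= k[i];
		h *= 16777619u;
	}
	return h;
}

/* Same shape as hash_any(): compute the hash, then box it as a Datum. */
static SketchDatum
sketch_hash_any(const unsigned char *k, size_t keylen)
{
	return (SketchDatum) sketch_hash_bytes(k, keylen);
}

/* Same shape as DatumGetUInt32(): unbox by truncating to 32 bits. */
static uint32_t
sketch_datum_get_uint32(SketchDatum d)
{
	return (uint32_t) d;
}

/* Box-then-unbox must equal the direct call for any input. */
static void
sketch_check_round_trip(const unsigned char *k, size_t keylen)
{
	assert(sketch_datum_get_uint32(sketch_hash_any(k, keylen)) ==
		   sketch_hash_bytes(k, keylen));
}
```

Under these assumptions the wrapper adds nothing for a caller that wants
a plain uint32, which is the case for the value fed to addHyperLogLog().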

    if (len > PG_CACHE_LINE_SIZE)
        hash ^= DatumGetUInt32(hash_uint32((uint32) len));

Similar here, but instead of hash_bytes_uint32(), we may as well use
murmurhash32().
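
As a rough standalone sketch of that second change (not the actual
patch): sketch_murmurhash32() below is written from my understanding of
the 32-bit MurmurHash3 finalizer that murmurhash32() implements, and
SKETCH_CACHE_LINE_SIZE is an assumed stand-in for PG_CACHE_LINE_SIZE.
Note that, unlike the hash_bytes() change, this one would alter the
resulting hash values.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in value for PG_CACHE_LINE_SIZE. */
#define SKETCH_CACHE_LINE_SIZE 128

/* 32-bit MurmurHash3 finalizer; hypothetical stand-in for murmurhash32(). */
static inline uint32_t
sketch_murmurhash32(uint32_t data)
{
	uint32_t	h = data;

	h ^= h >> 16;
	h *= 0x85ebca6b;
	h ^= h >> 13;
	h *= 0xc2b2ae35;
	h ^= h >> 16;
	return h;
}

/* Mix the length into the hash only when the input exceeds the prefix. */
static uint32_t
sketch_mix_length(uint32_t hash, size_t len)
{
	if (len > SKETCH_CACHE_LINE_SIZE)
		hash ^= sketch_murmurhash32((uint32_t) len);
	return hash;
}
```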

--
John Naylor
Amazon Web Services
