| From: | John Naylor <johncnaylorls(at)gmail(dot)com> |
|---|---|
| To: | Aleksander Alekseev <aleksander(at)tigerdata(dot)com> |
| Cc: | PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: [PATCH] Refactor *_abbrev_convert() functions |
| Date: | 2026-02-03 01:07:59 |
| Message-ID: | CANWCAZYHK4F1MPBytEKSS8qhi9kiUXhJTZq-rWcyzk6BCOyfYg@mail.gmail.com |
| Lists: | pgsql-hackers |
On Tue, Jan 13, 2026 at 7:34 PM Aleksander Alekseev
<aleksander(at)tigerdata(dot)com> wrote:
> Now when all Datums are 64-bit values we can simplify the code by
> using murmurhash64(). This refactoring was previously suggested by
> John Naylor [1].
There's more we can do here. Above the stanzas changed in the patch
there is this, at least for varlena/bytea:
    hash = DatumGetUInt32(hash_any((unsigned char *) authoritative_data,
                                   Min(len, PG_CACHE_LINE_SIZE)));
This makes no sense to me: hash_any() calls hash_bytes() and turns the
result into a Datum, and then we just get it right back out of the
Datum again. The comment on addHyperLogLog() says "typically generated
using hash_any()", but addHyperLogLog() takes a uint32, not a Datum, so
that comment should probably be changed. hash_bytes() is global, so we
can use it directly.
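
To illustrate the round trip, here is a minimal standalone sketch, not
PostgreSQL source. The Datum typedef and the conversion helpers are
simplified stand-ins, and the body of hash_bytes() is FNV-1a purely so
the example compiles on its own (PostgreSQL uses a different hash).
hash_via_datum() and hash_direct() are hypothetical names for the
current and the suggested form.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for PostgreSQL's Datum and its conversion helpers. */
typedef uint64_t Datum;

static inline Datum
UInt32GetDatum(uint32_t x)
{
    return (Datum) x;
}

static inline uint32_t
DatumGetUInt32(Datum d)
{
    return (uint32_t) d;
}

/* Stand-in body (FNV-1a); NOT the hash function PostgreSQL uses. */
static uint32_t
hash_bytes(const unsigned char *k, int keylen)
{
    uint32_t h = 2166136261u;

    for (int i = 0; i < keylen; i++)
    {
        h ^= k[i];
        h *= 16777619u;
    }
    return h;
}

/* Same shape as hash_any(): hash_bytes() wrapped in a Datum. */
static inline Datum
hash_any(const unsigned char *k, int keylen)
{
    return UInt32GetDatum(hash_bytes(k, keylen));
}

/* Current form: wrap the uint32 in a Datum, then unwrap it again. */
static uint32_t
hash_via_datum(const unsigned char *k, int keylen)
{
    return DatumGetUInt32(hash_any(k, keylen));
}

/* Suggested form: call hash_bytes() directly. */
static uint32_t
hash_direct(const unsigned char *k, int keylen)
{
    return hash_bytes(k, keylen);
}
```

Under these assumptions both helpers return the same value for any
input, so the Datum wrapping adds nothing.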
    if (len > PG_CACHE_LINE_SIZE)
        hash ^= DatumGetUInt32(hash_uint32((uint32) len));
Similar here, but instead of hash_bytes_uint32(), we may as well use
murmurhash32().
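
A rough standalone sketch of what that could look like, again not
PostgreSQL source. The murmurhash32() below is a local copy of the
MurmurHash3 32-bit finalizer, which is my understanding of what the
function in common/hashfn.h computes. CACHE_LINE_SIZE (set to 128 here)
and mix_in_length() are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed value, standing in for PG_CACHE_LINE_SIZE. */
#define CACHE_LINE_SIZE 128

/* Local copy of the MurmurHash3 32-bit finalizer. */
static inline uint32_t
murmurhash32(uint32_t data)
{
    uint32_t h = data;

    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/*
 * Hypothetical helper: fold the length into the hash only when the input
 * was longer than the prefix that was actually hashed. No Datum involved.
 */
static inline uint32_t
mix_in_length(uint32_t hash, size_t len)
{
    if (len > CACHE_LINE_SIZE)
        hash ^= murmurhash32((uint32_t) len);
    return hash;
}
```

Inputs at or below the prefix size keep their hash unchanged; longer
inputs get the mixed length XORed in.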
--
John Naylor
Amazon Web Services