Re: A varint implementation for PG?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>
Subject: Re: A varint implementation for PG?
Date: 2021-08-03 19:39:29
Message-ID: 1773272.1628019569@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> However, I suspect that the whole approach should be completely
> revised for a user-visible data type. On the one hand, there's no
> telling how large a value some user will want to represent, so
> limiting ourselves to 64 bits does seem shortsighted. And on the other
> hand, if we've got a varlena, we already know the length, so it seems
> like we shouldn't also encode the length in the value. Maybe there's a
> more efficient way, but the first thing that occurs to me is to just
> discard high order bytes that are all zeroes or all ones until the
> high order bit of the next byte doesn't match and plonk the remaining
> bytes into the varlena. To decompress, just sign-extend out to the
> target length. Really, this kind of representation can be extended to
> represent arbitrarily large integers, even bigger than what we can
> currently do with numeric, which is already crazy huge, and it seems
> to have some advantage in that every payload byte contains exactly 8
> data bits, so we don't need to shift or mask while encoding and
> decoding.

+1. I think this, together with our existing rules for varlena headers,
would address the issue quite nicely. Any sanely-sized integer would
require only a one-byte header, so the minimum on-disk size is 2 bytes
(with no alignment padding required). I don't buy that there's enough
need to justify inventing a new typlen code, since even if you did it
wouldn't improve things all that much compared to this design.

(Oh ... actually the minimum on-disk size is one byte, since value zero
would require no payload bytes.)
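
For concreteness, here is a rough sketch of the payload encoding described
above, restricted to int64 for brevity. The names varint_encode and
varint_decode are hypothetical, not existing PostgreSQL functions, and the
big-endian byte order is an assumption made only for illustration. The
varlena header is not handled here; the sketch only produces and consumes
the payload bytes.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: store the minimal big-endian two's-complement
 * payload for v in out (at most 8 bytes) and return its length.
 * Zero produces an empty payload.
 */
static size_t
varint_encode(int64_t v, uint8_t *out)
{
    uint64_t u = (uint64_t) v;
    uint8_t  b[8];
    size_t   skip = 0;
    size_t   i;

    /* Lay out all eight bytes, most significant first. */
    for (i = 0; i < 8; i++)
        b[i] = (uint8_t) (u >> (56 - 8 * i));

    /*
     * Discard a leading 0x00 or 0xFF byte as long as the high bit of the
     * next byte still carries the same sign.
     */
    while (skip < 7 &&
           ((b[skip] == 0x00 && (b[skip + 1] & 0x80) == 0) ||
            (b[skip] == 0xFF && (b[skip + 1] & 0x80) != 0)))
        skip++;

    /* Value zero: the last 0x00 byte can go too. */
    if (skip == 7 && b[7] == 0x00)
        skip = 8;

    for (i = skip; i < 8; i++)
        out[i - skip] = b[i];
    return 8 - skip;
}

/*
 * Hypothetical helper: sign-extend a payload of len bytes (0..8) back to
 * int64. An empty payload decodes as zero.
 */
static int64_t
varint_decode(const uint8_t *in, size_t len)
{
    uint64_t u;
    size_t   i;

    if (len == 0)
        return 0;

    /* Start from all ones for a negative value, all zeroes otherwise. */
    u = (in[0] & 0x80) ? ~(uint64_t) 0 : 0;
    for (i = 0; i < len; i++)
        u = (u << 8) | in[i];

    /* Assumes the usual two's-complement conversion to a signed type. */
    return (int64_t) u;
}
```

With a one-byte varlena header in front, this would put zero at 1 byte on
disk, values from -128 to 127 at 2 bytes, and any other int64 at no more
than 9 bytes. No shifting or masking of the payload bits is needed beyond
assembling whole bytes.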

regards, tom lane
